This handbook provides detailed programming information and hardware system design information
for the Intel 80960KB processor (which is part of the 80960K series of embedded-processor
products), as well as information on other 32-bit microprocessors, peripherals, and development
support tools.


Hardware designers can use this information as a guideline for developing microprocessor
systems. Applications programmers, compiler designers, and designers of operating-system
kernels will also find needed information on the software architecture, instruction set, and
programming of the 80960KB processor.


All of the processors in the 80960K series of products are based on the Intel 80960 architecture.
Most of the information in this handbook also applies to the 80960KA processor. The only
difference between the 80960KB and 80960KA processors is that the 80960KA does not provide
on-chip support for floating-point operations or operations on decimal numbers.


Wherever appropriate, design examples are included. These designs are based upon functional
80960KB boards and systems, and are simplified for ease of understanding. These simplified designs
have not been tested except for examples that include part numbers.


The Programmer's Reference provides programmers and system designers with detailed information
about the processor's programming environment and kernel (or executive) support facilities. It
also provides detailed reference information on the 80960 architecture, beyond that found in the
architecture overview.


The following is a brief overview of the contents of each section of the Programmer's Reference
portion of the manual:


Section 7 - Execution Environment. Describes the environment in which instructions are executed.
The topics discussed include the address space, registers, instruction pointer, and arithmetic
controls.


Section 8 - Procedure Calls. Describes the various mechanisms available for making procedure
calls. The topics discussed include the local call/return mechanism, procedure stack,
branch-and-link procedure calls, procedure table calls, and supervisor call mechanism.


Section 9 - Data Types and Addressing Modes. Describes non-floating-point data types and how
bits and bytes are addressed. The addressing modes provided for addressing data in memory are also
described in this section.


Section 10 - Instruction Set Summary. Overview of all non-floating-point instructions in the
80960KB instruction set, arranged by functional groups. The assembly language instruction format
is also described.


Section 11 - Processor Management and Initialization. Describes the processor management
facilities. Included is a discussion of the system data structures required to operate the
processor, the software requirements for processor management, and the requirements for physical
memory. Processor initialization concludes the section.


Section 12 - Interrupts. Description of the interrupt mechanism, interrupt priority, interrupt
table, interrupt handling procedures, and the software requirements for handling interrupts.


Section 13 - Fault Handling. Describes the processor's fault-handling mechanism, including the
fault-table structure, fault handling procedures, and the software requirements for handling
faults. Each fault is detailed in a reference section at the end of the section.


Section 14 - Debugging. Describes the debugging and monitoring support facilities, including the
trace control register.


Section 15 - Instruction Set Reference. Alphabetical listing of the complete 80960KB instruction
set, with detailed descriptions of each instruction, assembly-language syntax, examples, and
algorithms.


Section 16 - Floating Point Operation. Description of the floating-point processing facilities of
the processor. This section includes an overview of floating-point numbers as well as a
description of the 80960KB floating-point data types and their relationship to the IEEE
floating-point standard. Floating-point instructions, exceptions, and faults are also described.


Section 17 - Interagent Communication. Describes the interagent communication (IAC) mechanism,
which allows several processors to communicate with one another over the bus. The topics
discussed include the IAC mechanism and software requirements for using internal IACs. Each IAC
is described in detail in a reference section at the end of the section.


Appendix A - Instruction and Data Structure Quick Reference. Provides two lists of the 80960KB
instructions: one alphabetical by assembly-language mnemonic and one by machine language opcode.


Appendix C - Instruction Timing. Describes the 80960KB processor's instruction pipeline and its
effect on instruction timing. Includes each instruction's clock cycle requirement.


Appendix E - Considerations for Portable Software. Discusses the aspects of the 80960KB
architecture that should be considered if code written for the 80960KB processor is intended to
be ported later to other implementations of the 80960 architecture.


The following paragraphs describe the notation style conventions used in the architectural
overview and programmer's reference chapters, as well as terminology that has special meaning as
used in this handbook.


Integer numbers are presented in decimal format unless otherwise indicated by the subscript "H"
for hexadecimal or "B" for binary.
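

The notation rule above can be illustrated with a short Python sketch. It assumes the subscript
is written as a trailing letter (for example FFH or 1010B); the function name
parse_handbook_number is hypothetical and is used here only for illustration.

```python
def parse_handbook_number(text: str) -> int:
    """Convert a literal written in the handbook's notation to an integer.

    Assumption: the subscript is typed as a trailing letter.
    A trailing "H" marks hexadecimal, a trailing "B" marks binary,
    and anything else is read as decimal.
    """
    text = text.strip().upper()
    if text.endswith("H"):
        return int(text[:-1], 16)
    if text.endswith("B"):
        return int(text[:-1], 2)
    return int(text, 10)


if __name__ == "__main__":
    for literal in ("FFH", "1010B", "32"):
        print(literal, "=", parse_handbook_number(literal))
```

Checking the "H" suffix first matters, because a hexadecimal literal such as 1BH contains the
letter B as a digit.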


Certain fields in the processor's system data structures are described as being either reserved
fields or preserved fields.


A reserved field is one that other implementations of the 80960 architecture can use. To help
ensure that a current software design will be compatible with future processors based on the
80960 architecture, the bits in the reserved fields should be set to 0 when the structure is
initially created. Thereafter, software should not access these fields.


Some fields in system data structures are shown as being required to be set to either 1 or 0.
These fields should be treated as if they were reserved fields. They should be set to the
specified value when the data structure is created, and should not be accessed by software after
that.


A preserved field is one that the processor does not use. Software may use preserved fields for any 
function. 


The terms set and clear are used in this manual to refer to the value of a bit field in a system
data structure. If a bit is set, its value is 1; if the bit is clear, its value is 0. Likewise,
setting a bit means giving it a value of 1 and clearing a bit means giving it a value of 0.
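

As a minimal sketch of this terminology, the Python helpers below set, clear, and test one bit
of a 32-bit word. The helper names are hypothetical and do not correspond to any 80960
instruction or data structure.

```python
WORD_MASK = 0xFFFFFFFF  # fields are modeled as 32-bit words


def set_bit(value: int, bit: int) -> int:
    """Setting a bit gives it the value 1."""
    return (value | (1 << bit)) & WORD_MASK


def clear_bit(value: int, bit: int) -> int:
    """Clearing a bit gives it the value 0."""
    return value & ~(1 << bit) & WORD_MASK


def is_set(value: int, bit: int) -> bool:
    """A bit is set when its value is 1 and clear when its value is 0."""
    return (value >> bit) & 1 == 1


if __name__ == "__main__":
    word = set_bit(0, 3)
    print(hex(word), is_set(word, 3))
    word = clear_bit(word, 3)
    print(hex(word), is_set(word, 3))
```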


The 80960KB processor introduces the 80960 architecture, a new 32-bit architecture from Intel.
This architecture has been designed to meet the needs of embedded applications such as machine
control, robotics, process control, avionics, and instrumentation.


The 80960 architecture can best be characterized as a high-performance computing engine. It
features high-speed instruction execution and ease of programming. It is also easily extensible,
allowing processors and controllers based on this architecture to be conveniently customized to
meet the needs of specific processing and control applications.




The following are some of the important attributes of the 80960 architecture: 


Full 32-bit registers 


High-speed, pipelined instruction execution


A convenient program execution environment with 32 general-purpose registers and a versatile
set of special-function registers


A highly optimized procedure call mechanism that features on-chip caching of local variables 
and parameters 


Extensive facilities for handling interrupts and faults 


Extensive tracing facilities to support efficient program debugging and monitoring 


Register scoreboarding and write buffering to permit efficient operation when used with
lower-performance memory subsystems


The central processing module, memory module, and I/O module form the natural boundaries for the
hardware system architecture. The modules are connected together by the high-bandwidth 32-bit
multiplexed L-bus, which can transfer data at a maximum sustained rate of 53M bytes per second
for an 80960 processor operating at 20 MHz.
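

The 53M-byte figure can be reproduced with a short calculation. The sketch below assumes that
data moves in four-word bursts and that each burst occupies six clock cycles (one address cycle,
four data cycles, and one recovery cycle). The cycle breakdown is an assumption made for
illustration; it is not stated in this passage.

```python
CLOCK_HZ = 20_000_000     # 20 MHz processor clock, from the text
BYTES_PER_WORD = 4        # the L-bus is 32 bits wide
WORDS_PER_BURST = 4       # assumed burst length

# Assumption: 1 address cycle + 4 data cycles + 1 recovery cycle per burst.
CYCLES_PER_BURST = 1 + WORDS_PER_BURST + 1


def sustained_bytes_per_second(clock_hz: int = CLOCK_HZ) -> float:
    """Bytes moved per second when back-to-back bursts fill the bus."""
    bytes_per_burst = BYTES_PER_WORD * WORDS_PER_BURST
    return clock_hz * bytes_per_burst / CYCLES_PER_BURST


if __name__ == "__main__":
    rate = sustained_bytes_per_second()
    print(f"{rate / 1e6:.1f} million bytes per second")
```

Under these assumptions, 16 bytes every 6 cycles at 20 MHz gives roughly 53.3 million bytes per
second, which agrees with the figure quoted above.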


Figure 1 shows a simplified block diagram of one possible system configuration. The heart of this
system is the 80960KB processor, which fetches instructions, executes code, manipulates stored
information, and interacts with I/O devices. The high-bandwidth L-bus connects the 80960KB
processor to memory and I/O modules. The 80960KB processor stores system data, instructions, and
programs in the memory module. By accessing various peripheral devices in the I/O module, the
80960KB processor supports communication to terminals, modems, printers, disks, and other I/O
devices.


The 80960KB processor performs bus operations using multiplexed address and data signals, and
provides all the necessary control signals. For example, standard control signals, such as Address
Latch Enable (ALE), Address/Data Status (ADS), Write/Read Command (W/R), Data Transmit/Receive
(DT/R), and Data Enable (DEN), are provided by the 80960KB processor. The 80960 processor also
generates byte enable signals that specify which bytes on the 32-bit data lines are valid for the
transfer.


The L-bus supports burst transactions, which access up to four data words at a maximum rate of
one word per clock cycle. The 80960KB processor uses the two low-order address lines to indicate
how many words are to be transferred. The 80960KB processor performs burst transactions to load
the on-chip 512-byte instruction cache to minimize memory accesses for instruction fetches. Burst
transactions can also be used for data access.


To transfer control of the bus to an external bus master, the 80960KB provides two arbitration
signals: hold request (HOLD) and hold acknowledge (HLDA). After receiving HOLD, the processor
grants control of the bus to an external master by asserting HLDA.


The 80960KB processor provides a flexible interrupt structure by using an on-chip interrupt
controller, an external interrupt controller, or both. The type of interrupt structure is
specified by an internal interrupt vector register. For a system with multiple processors, another
method is available, called interagent communication (IAC), in which a processor can interrupt
another processor by sending an IAC message.


A memory module can consist of a memory controller, Erasable Programmable Read Only Memory
(EPROM), and static or dynamic Random Access Memory (RAM). The memory controller first conditions
the L-bus signals for memory operation. It demultiplexes the address and data lines, generates
the chip select signals from the address, detects the start of the cycle for burst mode
operation, and latches the byte enable signals.


The memory controller generates the control signals for EPROM, SRAM, and DRAM. Specifically,
it provides the control signals, multiplexed row/column address, and refresh control for dynamic
RAMs. The controller can be designed to accommodate the burst transaction of the 80960KB processor
by using the static column mode or nibble mode features of the dynamic RAM. In addition to
supplying the operational signals, the controller generates the READY signal to indicate that
data can be transferred to or from the 80960KB processor.


The 80960KB processor directly addresses up to 4G bytes of physical memory. To ease the design
of the controller, the processor does not allow burst accesses to cross a 16-byte boundary. Each
address specifies a four-byte data word within the block. Individual data bytes can be accessed by
using the four byte-enable signals from the 80960KB processor. Chapter 5 provides design
guidelines for the memory controller.
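

The two rules in this paragraph (bursts stay inside a 16-byte block, and byte enables select
bytes within a word) can be sketched as follows. This is an illustrative model, not the bus
logic itself: the function names are hypothetical, and the byte enables are shown as a mask in
which a 1 bit means "byte selected", with no attempt to model the electrical polarity of the
actual signals.

```python
BLOCK_BYTES = 16   # bursts may not cross a 16-byte boundary
WORD_BYTES = 4     # each address names a four-byte word


def split_bursts(address: int, word_count: int) -> list[tuple[int, int]]:
    """Split a run of words into bursts that each stay inside one 16-byte block.

    Returns a list of (start_address, words_in_burst) pairs.
    """
    if address % WORD_BYTES != 0:
        raise ValueError("address must be word aligned")
    bursts = []
    while word_count > 0:
        words_left_in_block = (BLOCK_BYTES - address % BLOCK_BYTES) // WORD_BYTES
        words = min(word_count, words_left_in_block)
        bursts.append((address, words))
        address += words * WORD_BYTES
        word_count -= words
    return bursts


def byte_enable_mask(address: int, size: int) -> int:
    """Return a 4-bit mask of the byte lanes used by a 1-, 2-, or 4-byte access."""
    offset = address % WORD_BYTES
    if size not in (1, 2, 4) or offset + size > WORD_BYTES:
        raise ValueError("access must fit inside one word")
    return ((1 << size) - 1) << offset


if __name__ == "__main__":
    print(split_bursts(0x1008, 4))
    print(bin(byte_enable_mask(0x1002, 2)))
```

A four-word transfer that starts at offset 8 of a block is therefore carried out as two
two-word bursts, so a memory controller built this way never has to carry a burst across a
block boundary.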


The I/O module consists of the I/O components and the interface circuit. I/O components can be
used to allow the 80960KB processor to use most of its clock cycles for computational and system
management activities. Time-consuming tasks can be off-loaded to specialized slave-type
components, such as the 8259A Programmable Interrupt Controller or the 82530 Serial Communication
Controller. Some tasks may require a master-type component, such as the 82586 Local Area Network
Control.


The interface circuit performs several functions. It demultiplexes the address and data lines,
generates the chip select signals from the address, produces the I/O read or I/O write command
from the processor's W/R signal, latches the byte enable signals, and generates the READY
signals. Since some of these functions are identical to those of the memory controller, the same
logic can be used for both interfaces. For master-type peripherals that operate on a 16-bit data
bus, the interface circuit translates the 32-bit data bus to a 16-bit data bus.


The 80960KB processor uses memory-mapped addresses to access I/O devices. This allows the CPU
to use many of the same instructions to exchange information with both memory and peripheral
devices. Thus, the powerful memory-type instructions can be used to perform 8-, 16-, and 32-bit
data transfers.


Much of the design of the 80960 architecture has been aimed at maximizing the processor's
computational and data processing speed through the use of increased parallelism. The following
paragraphs describe several of the mechanisms and techniques used to accomplish this goal.


One of the more important features of the 80960 architecture is that it performs most operations
on operands in registers, rather than in memory. For example, all arithmetic, logic, comparison,
branching, and bit operations are performed with registers and literals.


This feature provides two benefits. First, it increases program execution speed by minimizing
the number of memory accesses necessary to execute a program. Second, it reduces the memory
latency encountered when using slower, lower-cost memory parts.


To support this concept, the architecture provides a generous supply of general-purpose
registers. For each procedure, 32 registers are available, 28 of which are available for general
use. These registers are divided into two types: global and local. Both types of registers can be
used for general storage of operands. The only difference is that global registers retain their
contents across procedure boundaries, whereas the processor allocates a new set of local
registers each time a new procedure is called.


The architecture also provides a set of fast, versatile load and store instructions. These
instructions allow burst transfers of 1, 2, 4, 8, 12, or 16 bytes of information between memory
and the registers.


To further reduce memory accesses, the architecture offers two mechanisms for caching code and
data on chip: an instruction cache and multiple sets of local registers. The instruction cache
allows prefetching of blocks of instructions from memory. This helps ensure that the instruction
execution pipeline is supplied with a steady stream of instructions. It also reduces the number
of memory accesses required when performing iterative operations such as loops. The architecture
allows the size of the instruction cache to vary. For the 80960KB processor, it is 512 bytes.


To optimize the architecture's procedure call mechanism, the processor provides multiple sets of
local registers. This allows the processor to perform procedure calls without having to write the
local registers out to the stack in memory. The number of register sets depends on the processor
implementation. The 80960KB processor provides four sets of local registers.
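

The effect of keeping four local register sets on chip can be sketched with a small counting
model. The class below is a hypothetical illustration, not a description of the processor's
internal logic: it assumes that when all on-chip sets are in use, a further call writes the
oldest set out to the stack in memory, and that returning to a procedure whose set was written
out requires reading it back.

```python
class LocalRegisterCache:
    """Count stack traffic caused by calls and returns (illustrative model)."""

    def __init__(self, sets_on_chip: int = 4):
        self.sets_on_chip = sets_on_chip
        self.depth = 1       # the running procedure already owns one set
        self.resident = 1    # register sets currently held on chip
        self.spills = 0      # sets written out to the stack in memory
        self.fills = 0       # sets read back from the stack in memory

    def call(self) -> None:
        if self.resident == self.sets_on_chip:
            self.spills += 1          # oldest set makes room for the new one
        else:
            self.resident += 1
        self.depth += 1

    def ret(self) -> None:
        if self.depth == 1:
            raise RuntimeError("return without a matching call")
        self.depth -= 1
        self.resident -= 1
        if self.resident == 0:
            self.fills += 1           # caller's set was spilled earlier
            self.resident = 1


if __name__ == "__main__":
    cache = LocalRegisterCache()
    for _ in range(4):
        cache.call()
    print("spills after four nested calls:", cache.spills)
```

In this model the first three nested calls cause no memory traffic, and only the fourth forces
a register set out to the stack.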


The 80960 architecture also enhances program execution speed by overlapping the execution of
some instructions. In the 80960K series of processors, this is accomplished through register
scoreboarding.


Register scoreboarding permits instruction execution to continue while data is being fetched from
memory. When a load instruction is executed, the processor sets one or more scoreboard bits to
indicate the target registers to be loaded. After the target registers are loaded, the scoreboard
bits are cleared. While the target registers are being loaded, the processor is allowed to
execute other instructions that do not use these registers.


The processor uses the scoreboard bits to ensure that the target registers are not used until the
loads complete. (Scoreboard bits are checked transparently to software.) This technique allows
code to be arranged so that some instructions execute in zero effective clock cycles (that is,
they execute for free).
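

The benefit of scoreboarding can be shown with a simplified timing model. The sketch below is
an assumption-laden illustration rather than a cycle-accurate description of the 80960KB: every
instruction is taken to issue in one cycle, a load is given a hypothetical four-cycle memory
latency, and an instruction stalls until every register it names is clear on the scoreboard.

```python
def run(program, load_latency: int = 4) -> int:
    """Return the cycle at which the program and all of its loads have finished.

    program is a list of (kind, dest, sources) tuples, where kind is
    "load" or "alu" and sources is a list of register names.
    """
    busy_until = {}   # register -> cycle at which its scoreboard bit clears
    cycle = 0
    for kind, dest, sources in program:
        # Stall until no register touched by this instruction is scoreboarded.
        start = max([cycle] + [busy_until.get(r, 0) for r in sources + [dest]])
        if kind == "load":
            # The scoreboard bit stays set until the data arrives.
            busy_until[dest] = start + 1 + load_latency
        cycle = start + 1
    return max([cycle] + list(busy_until.values()))


# The dependent add immediately follows the load, so it stalls.
naive = [
    ("load", "r4", []),
    ("alu", "r5", ["r4", "r6"]),
    ("alu", "r7", ["r8", "r9"]),
]

# The independent add is moved into the shadow of the load.
scheduled = [
    ("load", "r4", []),
    ("alu", "r7", ["r8", "r9"]),
    ("alu", "r5", ["r4", "r6"]),
]

if __name__ == "__main__":
    print("naive order:    ", run(naive), "cycles")
    print("scheduled order:", run(scheduled), "cycles")
```

Under these assumptions the reordered sequence finishes one cycle sooner: the independent add
overlaps the load and so costs nothing extra, which is the sense in which an instruction can
execute for free.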


The 80960 architecture is designed to let a processor execute commonly used instructions, such as
moves, adds, subtracts, logical operations, and branches, in a minimum number of clock cycles
(preferably one cycle). The architecture supports this concept in several ways. For example, the
load and store model described earlier eliminates the clock cycles required to perform
memory-to-memory operations, by concentrating on register-to-register operations.


In addition, all of the instructions in the 80960 architecture are 32 bits long and aligned on
32-bit boundaries. This lets instructions be decoded in one clock cycle, and eliminates the need
for an instruction-alignment stage in the pipeline.


The 80960KB processor takes full advantage of these features of the architecture, resulting in
more than 50 instructions that can be executed in a single clock cycle.


The 80960 architecture provides an efficient mechanism for servicing interrupts from external
sources. To handle interrupts, the processor maintains an interrupt table of 248 interrupt
vectors, 240 of which are available for general use. When an interrupt is signaled, the processor
uses a pointer to the interrupt table to perform an implicit call to an interrupt handler
procedure. In performing this call, the processor automatically saves the state of the processor
prior to receiving the interrupt, performs the interrupt routine, then restores the state of the
processor. A separate interrupt stack is also provided to segregate interrupt handling from
application programs.


The interrupt handling facilities also allow interrupts to be evaluated by priority. The processor is then 
able to store interrupt vectors that are lower in priority than the current processor task in a pending 
interrupt section of the interrupt table. The processor checks and services the pending interrupts at 
defined times. 
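

The priority-based handling described above can be sketched as follows. The model is
illustrative only. It assumes that vectors are numbered 8 through 255 (which gives the 248
vectors mentioned earlier) and that a vector's priority is its number divided by 8; both
details are assumptions not stated in this passage. The class and method names are
hypothetical, and interrupts of equal priority are simply treated as pending.

```python
class InterruptModel:
    """Service or post interrupts by priority (illustrative model)."""

    def __init__(self, current_priority: int = 0):
        self.current_priority = current_priority
        self.pending = []    # vectors posted for later servicing
        self.serviced = []   # vectors whose handlers have been called

    @staticmethod
    def priority(vector: int) -> int:
        # Assumption: vectors run from 8 to 255 and priority = vector // 8.
        if not 8 <= vector <= 255:
            raise ValueError("vector out of range")
        return vector // 8

    def signal(self, vector: int) -> None:
        if self.priority(vector) > self.current_priority:
            self.serviced.append(vector)   # implicit call to the handler
        else:
            self.pending.append(vector)    # posted in the pending section

    def check_pending(self, new_priority: int) -> None:
        """Model one of the defined times at which pending interrupts are checked."""
        self.current_priority = new_priority
        ready = sorted(
            (v for v in self.pending if self.priority(v) > new_priority),
            reverse=True,
        )
        for vector in ready:
            self.pending.remove(vector)
            self.serviced.append(vector)


if __name__ == "__main__":
    model = InterruptModel(current_priority=10)
    for vector in (200, 40, 72):
        model.signal(vector)
    print("serviced:", model.serviced, "pending:", model.pending)
    model.check_pending(new_priority=4)
    print("serviced:", model.serviced, "pending:", model.pending)
```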


Because of their streamlined execution environment, processors based on the 80960 architecture
are particularly easy to program. The following paragraphs describe some of the architecture
features that simplify programming.


The procedure call mechanism makes procedure calls and parameter passing between procedures
simple and compact. Each time a call instruction is issued, the processor automatically saves the
current set of local registers and allocates a new set for the called procedure. Likewise, on a
return from a procedure, the current set of local registers is deallocated and the local
registers for the procedure being returned to are restored. This means a program never has to
explicitly save and restore those local variables that are stored in local registers.


The selection of instructions and addressing modes also simplifies programming. A full set of
load, store, move, arithmetic, comparison, and branch instructions is provided, with operations
on both integer and ordinal data types. Operations on bits and bit strings are simplified by a
complete set of Boolean and bit-field instructions.


The addressing modes are efficient and straightforward, while at the same time providing the
necessary indexing and scaling modes required to address complex arrays and record structures. The
large 4-gigabyte address space provides ample room to store programs and data. The availability of
32 addressing lines allows some address lines to be memory-mapped to control hardware functions.


To aid in program development, the 80960 architecture defines a wide range of faults that the
processor detects, including arithmetic faults, invalid operations, invalid operands, and machine
faults. When a fault is detected, the processor makes an implicit call to a fault handler routine,
in a way similar to the interrupt mechanism described previously. The information collected for
each fault allows program developers to quickly correct faulting code, and allows automatic
recovery from some faults.


To support debugging systems, the 80960 architecture provides a mechanism for monitoring
processor activity by means of trace events. When the processor detects a trace event, it signals
a trace fault and calls a fault handler. Intel provides several tools that use this feature,
including an in-circuit emulator (ICE) device.


The 80960 architecture provides several features that enable processors based on this architecture
to be easily customized to meet the needs of specific embedded applications, such as signal
processing, array processing, or graphics processing.


The most important of these features is the set of 32 special function registers. These registers
provide a convenient interface to circuitry in the processor or pins that can be connected to
external hardware. They can be used to control timers, to perform operations on special data
types, or to perform I/O functions. The special function registers are similar to the global
registers. They can be addressed by all of the register access instructions.


The 80960K series of processors provides a complete implementation of the 80960 architecture,
plus several extensions to that architecture. These extensions fall into two categories:
floating-point processing and interagent communication.


The 80960KB processor provides a complete implementation of the IEEE standard for binary
floating-point arithmetic (IEEE 754-1985). This implementation includes a full set of
floating-point operations, including add, subtract, multiply, divide, trigonometric functions,
and logarithmic functions. These operations are performed on single precision (32-bit), double
precision (64-bit), and extended precision (80-bit) real numbers.
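

For readers unfamiliar with the IEEE formats, the sketch below uses Python's standard struct
module to split single precision and double precision values into their sign, exponent, and
fraction fields. It illustrates the IEEE 754 encodings only; it does not model the 80960KB
floating-point unit, and the 80-bit extended format is omitted because struct has no portable
code for it. The function names are hypothetical.

```python
import struct


def single_fields(x: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent, fraction) of x as a 32-bit IEEE value."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def double_fields(x: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent, fraction) of x as a 64-bit IEEE value."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)


if __name__ == "__main__":
    print("single 1.0 :", single_fields(1.0))
    print("single -2.5:", single_fields(-2.5))
    print("double 1.0 :", double_fields(1.0))
```

The single precision format uses 1 sign bit, 8 exponent bits, and 23 fraction bits; the double
precision format uses 1, 11, and 52 bits respectively.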


One of the benefits of this implementation is that the floating-point handling facilities are
integrated into the normal instruction execution environment. Single and double precision
floating-point values are stored in the same registers as non-floating-point values. Four 80-bit
floating-point registers are provided to hold extended-precision values.


All of the processors in the 80960K series provide an interagent communication (IAC) mechanism,
allowing agents connected to the processor's bus to communicate with one another. This mechanism
operates similarly to the interrupt mechanism, except that IAC messages are passed through
dedicated sections of memory. The sorts of tasks handled with IAC messages include processor
reinitialization, stopping the processor, purging the instruction cache, and forcing the
processor to check pending interrupts.


80960KB Architectural


The 80960KB is the first 32-bit microprocessor designed especially for embedded applications. At
an operating frequency of 20 MHz, this high-performance processor can sustain an instruction
execution rate of seven and one-half million instructions per second (MIPS), and burst rates of
20 MIPS. The 80960KB processor enhances embedded system performance by integrating special
features to eliminate the need for additional peripheral devices and the associated software
overhead. For instance, the 80960KB processor offers an on-chip floating-point processing unit,
an improved interrupt handling capability, and support for debugging and tracing. This chapter
describes the architectural attributes and enhancements of the 80960KB processor for embedded
computing.
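

The quoted rates translate directly into average clock cycles per instruction. The short
calculation below uses only the figures given above (20 MHz, 7.5 MIPS sustained, 20 MIPS
burst); the function name is hypothetical.

```python
CLOCK_MHZ = 20.0   # operating frequency from the text


def clocks_per_instruction(mips: float, clock_mhz: float = CLOCK_MHZ) -> float:
    """Average clock cycles spent per instruction at a given execution rate."""
    return clock_mhz / mips


if __name__ == "__main__":
    print(f"sustained: {clocks_per_instruction(7.5):.2f} clocks per instruction")
    print(f"burst:     {clocks_per_instruction(20.0):.2f} clocks per instruction")
```

The burst rate therefore corresponds to one instruction per clock cycle, while the sustained
rate corresponds to an average of about 2.7 clock cycles per instruction.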


For over a decade, Intel has designed a large variety of 8- and 16-bit microcontrollers to fit
the needs of embedded applications. This experience has shown that several architectural
attributes, shared by both microcontrollers and microprocessors, benefit embedded applications
and enhance microprocessor performance. Because the 80960KB processor incorporates these
attributes (listed below) in its architecture, embedded applications are easy to design, perform
well, and get to market fast.


Simple load/store design 


Large general-purpose register sets


Boolean and bit-field instructions 


Small number of operations and addressing modes 


Simplified instruction format 


Minimum cycle operation 


In the 80960 family architecture, operations are register-to-register, with only LOAD and STORE
instructions accessing memory. This attribute simplifies the instruction set and shortens cycle
time. The 80960KB processor uses LOAD and STORE instructions to access memory. It further
minimizes accesses to memory by providing a 512-byte, direct-mapped instruction cache. When a
memory access is required, the processor can perform a burst transaction that accesses up to four
data words with one word transferred every clock cycle.
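

The behavior of a direct-mapped cache can be sketched with a small model. The 512-byte size
comes from the text; the 16-byte line size is an assumption chosen to match a four-word burst,
and the class is a hypothetical illustration rather than a description of the on-chip cache.

```python
class DirectMappedCache:
    """Track hits and misses for instruction fetches (illustrative model)."""

    def __init__(self, size_bytes: int = 512, line_bytes: int = 16):
        self.line_bytes = line_bytes                # assumed line size
        self.num_lines = size_bytes // line_bytes
        self.tags = [None] * self.num_lines         # one tag per cache line
        self.hits = 0
        self.misses = 0

    def fetch(self, address: int) -> bool:
        """Return True on a hit; on a miss, load the line and return False."""
        line_number = address // self.line_bytes
        index = line_number % self.num_lines        # each address maps to one line
        tag = line_number // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag                      # line filled from memory
        self.misses += 1
        return False


if __name__ == "__main__":
    cache = DirectMappedCache()
    # An eight-instruction loop executed ten times.
    for _ in range(10):
        for address in range(0x1000, 0x1020, 4):
            cache.fetch(address)
    print("hits:", cache.hits, "misses:", cache.misses)
```

In this model a loop that fits in the cache misses only while its lines are first loaded, and
every later iteration is served entirely from the cache. Two addresses that lie 512 bytes apart
map to the same line and displace each other, which is the usual cost of a direct-mapped
organization.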


Because the instructions operate on operands within registers, the 80960 family uses many
registers. The 80960KB processor features large, versatile register sets. For maximum
flexibility, each processor provides thirty-two 32-bit registers and four 80-bit floating-point
registers.


There are two types of general-purpose registers: local and global. The processor automatically
accesses the 16 local registers when a procedure call is performed. Multiple sets of local
registers are stored on-chip to further increase the efficiency of this register set, as shown in
Figure 1. The register cache holds up to four local register frames, which means that up to three
procedure calls can be made without having to access the procedure stack resident in memory.


The 20 global registers retain their contents across procedure boundaries. The global registers
consist of sixteen 32-bit registers (G15 through G0) and four 80-bit registers (FP3 through FP0)
as shown in Figure 2. While all registers can be used for floating-point operations, the 80-bit
registers are used for accumulation of extended precision results.


The 80960 family uses relatively few addressing modes to facilitate a fast, simple interpretation
by the control engine. The 80960KB processor provides simple, fast addressing modes, as well as a
few complex addressing modes to allow optimizations for code density.


A simplified instruction format eases the hardwired decoding of instructions, which again speeds
control paths. The 80960KB processor's instruction formats are simple and word aligned; all
instructions are one word long except for one class that uses the subsequent word as a 32-bit
displacement. To further enhance performance, the instructions do not cross word boundaries. This
feature eliminates a pipeline stage (that would have to align instructions) and decreases
instruction execution time.


NOTE: ANY REGISTER CAN BE USED FOR FLOATING-POINT OPERATIONS. THE 80-BIT REGISTERS ARE PROVIDED
FOR EXTENDED PRECISION ACCUMULATION.


To optimize performance, the 80960KB processor overlaps instruction execution by means of write
buffering and register scoreboarding. Write buffering allows a write instruction to proceed as
soon as it is placed in the buffer. It does not have to wait for the actual write operation to
occur on the L-bus.


Similarly, register scoreboarding is a design technique that allows the 80960KB to continue
execution of instructions when it encounters a LOAD instruction. When the LOAD instruction
begins, the 80960KB sets a scoreboard bit on the target register. After the target register is
loaded with data, the processor resets the bit. While the data is being retrieved, additional
instructions that do not reference the target register can be executed.


The 80960KB ensures that these additional instructions do not reference the target register by
checking the scoreboard transparently (no software required). Thus, the scoreboard feature reduces
the effect of slow memory speed and provides a useful tool for optimizing procedures.


The 80960KB processor executes most of the core instructions in a single clock cycle. For these
instructions, the 80960KB processor uses hardwired logic rather than microcode to execute the
instruction.


The 80960KB also supports a number of important multicycle instructions, such as 32-bit multiply
and divide instructions. These auxiliary functions require more than one clock cycle because it
is more efficient to implement them in microcode than in hardwired logic. On the other hand, the
integration of these functions on-chip eliminates much of the software overhead and loss of code
density that would otherwise be incurred. Thus, the additional functionality of the 80960KB
enhances overall system performance while keeping code size small.


The 80960KB incorporates two useful features: on-chip floating-point processing and debugging
functions. The floating-point unit can be used for applications that require precision, such as
machine-control operations. The debugging function significantly decreases development time.


The on-chip floating-point unit of each processor improves the performance of floating-point
calculations by eliminating bus overhead used to transfer operands to a coprocessor. The processor
provides hardware support for both mandatory and recommended portions of IEEE standard 754 for
floating-point arithmetic, exponential, logarithmic, and other transcendental functions. By
integrating the floating-point unit on-chip, the 80960KB processor reduces the overall chip count
for a system, decreases power consumption, and increases overall performance and reliability.


The processor provides extensive system debug capabilities, an important feature for embedded
computing where the ability to instrument an application may be limited. The 80960KB processor
allows breakpoint instructions that stop program execution on various events, such as procedure
calls or certain instructions. Another debug facility traces the activity of the processor while
it is executing a program. Tracing is done by recording the addresses of instructions that cause
trace events to occur. For example, a trace event can occur on the execution of a specific
instruction, branch, or procedure call. To ensure that the 80960KB is operating properly, the
processor performs a self-test when it is reset. If the self-test is successful, the 80960KB
begins operation; otherwise, it enters the stopped state.


The advanced features of the 80960KB processor are implemented using a performance-optimized
bus interface. The processor uses a high-bandwidth local bus (L-bus) that consists of standard
signal groups: a 32-bit multiplexed address/data path and control signals for data transactions.
Because of the large amount of caching, the L-bus supports burst transactions that transfer up to
four successive data words. Transactions on the L-bus can use 8-, 16-, and 32-bit data types and
address up to 4G bytes of physical memory. Bus arbitration can be accomplished by simply using the
hold request/hold acknowledge protocol.


The 80960KB processor offers a flexible way to manage interrupts. It accepts interrupts in one of
three ways: by communicating with an external interrupt controller using the standard
Interrupt/Interrupt Acknowledge signals, by activating the on-chip interrupt controller, or by
accepting an inter-agent communication (IAC) message. This allows the 80960KB to act as a
coprocessor on a shared bus with another CPU.


The 80960KB processor optimizes embedded system performance by using a new 32-bit architecture.
The 80960 family architecture includes a load/store design, large general-purpose register sets,
fast addressing modes, a simplified instruction format, and minimized instruction execution cycles.


To further enhance system performance, the 80960KB processor provides floating-point operation,
interrupt controller capabilities, and debug functions. By integrating these functions on-chip, the
80960KB reduces the power requirements and overall chip count for a system.


As a result of the 80960 architecture, the 80960KB processor provides unprecedented performance.
For a speed selection of 20 MHz, it can sustain an instruction execution rate of over seven and
one-half MIPS and burst rates of 20 MIPS, speeds comparable to those of super minicomputers.
The high instruction execution rates are made possible through an innovative design that
incorporates an on-chip instruction cache with burst-transfer capability.


This section illustrates 
the flexibility 
and power of the 80960KB 
system architecture 
using the 
advanced 
32-bit 80960KB 
processor. The section examines system configurations from a general
perspective to explain the design concepts. Subsequent sections describe the system design.


The central processing module, memory module, and I/O module form the natural boundaries for the 
hardware system architecture. 
The modules are connected together by the high bandwidth 
32-bit 
multiplexed L-bus, which can transfer data at a maximum sustained rate of 53M bytes per second for 
an 80960KB processor operating at 20 MHz. 
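
The 53M-byte figure is consistent with back-to-back four-word bursts, assuming each burst occupies one address state, four data states, and one recovery state with no wait states (the state sequence described later in this section). The following Python sketch checks that arithmetic; the function name and its parameters are illustrative and do not come from the data sheet.

```python
def sustained_bandwidth(clock_hz, words_per_burst=4, wait_states=0):
    """Bytes per second for back-to-back transactions on a 32-bit bus.

    Assumes each transaction costs one address state (Ta), one data
    state (Td) per word, any wait states (Tw), and one recovery state (Tr).
    """
    bytes_per_word = 4
    cycles = 1 + words_per_burst + wait_states + 1  # Ta + Td(s) + Tw(s) + Tr
    return clock_hz * words_per_burst * bytes_per_word / cycles


rate = sustained_bandwidth(20_000_000)
print(f"{rate / 1e6:.1f}M bytes per second")
```

Under the same assumptions, a single-word transaction takes three cycles for four bytes, which is why burst transfers matter for sustained throughput.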

Figure 3 shows a simplified block diagram of a possible system configuration.
The heart of this 
system is the 80960KB processor, which fetches program instructions, executes code, manipulates 
stored information, and interacts with I/O devices. The high bandwidth L-bus connects the 80960KB 
processor to memory and I/O modules. The 80960KB processor stores system data, instructions,
and programs in the memory module. By accessing various peripheral devices in the I/O module, the
80960KB processor supports terminals, modems, printers, disks, and other I/O devices. 


The 80960KB processor performs bus operations 
using multiplexed 
address and data signals and 
provides all the necessary control signals. For example, standard control signals, such as Address
Latch Enable (ALE), Address/Data Status (ADS), Write/Read command (W/R), Data Transmit/Receive
(DT/R), and Data Enable (DEN), are provided by the 80960KB processor. The 80960KB
processor also generates byte enable signals that specify which bytes on the 32-bit data lines are valid 
for the transfer. 


The L-bus supports burst transactions, 
which access up to four data words at a maximum rate of one 
word per clock cycle. The 80960KB processor uses the two low-order address lines to indicate how 
many words are to be transferred. The 80960KB processor performs burst transactions 
to load the 
on-chip 512-byte 
instruction 
cache to minimize 
memory accesses for instruction 
fetches. Burst 
transactions 
can also be used for data accesses. 


To transfer control of the bus to an external bus master, the 80960KB 
processor 
provides 
two 
arbitration signals: hold request (HOLD) and hold acknowledge 
(HLDA). After receiving HOLD, 
the processor grants control of the bus to an external bus master by asserting HLDA. 


The 80960KB 
processor 
provides 
a flexible 
interrupt 
structure 
by using an on-chip 
interrupt 
controller, an external interrupt controller, or both. The type of interrupt structure is specified by an 
internal interrupt vector register. For a system with multiple processors, another method is available,
called inter-agent communication (IAC), where a processor can interrupt another processor by
sending an IAC message.


A memory 
module 
can consist of the memory 
controller, 
Erasable 
Programmable 
Read Only 
Memory (EPROM), and static or dynamic Random Access Memory (RAM). The memory controller 
first conditions the L-bus signals for memory operation. It demultiplexes 
the address and data lines, 
generates 
the chip select signals from the address, detects the start of the cycle for burst mode 
operation, and latches the byte enable signals. 


The memory controller generates the control signals for EPROM, SRAM, and DRAM. In particular, 
it provides the control signals, multiplexed 
row/column 
address, and refresh control for dynamic 
RAMs. The controller 
can be designed 
to accommodate 
the burst transaction 
of the 80960KB 
processor by using the static column mode or nibble mode features of the dynamic RAM. In addition 
to supplying the operation signals, the controller generates the READY signal to indicate that data 
can be transferred to or from the 80960KB processor. 


The 80960KB processor directly addresses up to 4G bytes of physical memory. The processor does 
not allow burst accesses to cross a 16-byte boundary to ease the design of the controller. Each address 
specifies a four-byte data word within the block. Individual data bytes can be accessed by using the 
four byte enable signals from the 80960KB processor. 


The I/O module consists of the I/O components and the interface circuit. I/O components can be used 
to allow the 80960KB 
processor 
to use most of its clock cycles for computational 
and system 
management 
activities. Time consuming 
tasks can be off-loaded to specialized 
slave-type components, such as the 8259A Programmable Interrupt Controller or the 82530
Communication Controller. Some tasks may require a master-type component, such as the 82586
Local Area Network Control.


The interface 
circuit performs 
several functions. 
It demultiplexes 
the address 
and data lines, 
generates the chip select signals from the address, produces the I/O read or I/O write command from
the processor's W/R signal, latches the byte enable signals, and generates the READY signal.


Because these functions are the same as some of the functions of the memory controller, the same 
logic can be used for both interfaces. For master-type peripherals that operate on a 16-bit data bus, 
the interface circuit translates the 32-bit data bus to a 16-bit data bus. 


The 80960KB processor uses memory-mapped 
addresses to access I/O devices. This allows the CPU 
to use many of the same instructions 
to exchange 
information 
for both memory 
and peripheral 
devices. Thus, the powerful memory-type 
instructions can be used to perform 8-, 16-, and 32-bit data 
transfers. 


Section 5 describes 
design guidelines 
for the I/O interface by examining 
representative 
design 
examples. 


The basic hardware system configuration 
is modular and flexible. The processor, memory, and I/O 
modules form the natural boundaries in the basic hardware system architecture. The high-bandwidth 
L-bus that supports burst transfers is used for the data path between the 80960KB processor and other 
modules. 


The 32-bit multiplexed 
local bus (L-bus) connects the 80960KB processor to memory and I/O and
forms the backbone of any 80960KB processor-based system. This high bandwidth bus provides
burst-transfer capability, allowing up to four successive 32-bit data word transfers at a maximum rate
of one word every clock cycle. In addition to the L-bus signals, the 80960KB processor uses other
signals to communicate with other bus masters. This section, which describes these signals and the
associated operations, follows the outline shown below: 


L-bus states and their relationship 
to each other 


L-bus signal groups, which consist of address/data 
and control 


L-bus read, write, and burst transactions 


L-bus timing analyses and timing circuit generation 


Related L-bus operations such as arbitration, interrupt, and reset operations 


The L-bus forms the data communication 
path between the various components in a basic 80960KB 
hardware 
system. The 80960KB processor 
utilizes the L-bus to fetch instructions, 
to manipulate 
information 
from both memory and I/O devices, and to respond to interrupts. To perform these 
functions at a high data rate, the 80960KB processor provides a burst mode, which transfers up to 
four data words at a maximum rate of one 32-bit word per clock cycle. The 80960KB L-bus has the 
following features: 


32-bit multiplexed 
address/data 
path 


High data bandwidth relative to the speed selection of the 80960KB processor 


Four byte enables and a four-word burst capability that allow transfers from 1 to 16 bytes in
length


Support for TTL latches and buffers. 


The L-bus has five basic bus states: idle (Ti), address (Ta), data (Td), recovery (Tr), and wait (Tw).
During system operation, the 80960KB processor continuously 
enters and exits different bus states 
as shown in Figure 4. This state diagram assumes that only one bus master resides on the L-bus. 


The processor occupies the Ti state when no address/data transfers are in progress. When a new
request is received, the 80960KB processor enters the Ta state to transmit the address.


Following a Ta state, the 80960KB processor enters a Td state to transmit or receive data on the
address/data lines, provided that the data is ready (indicated by the assertion of READY at the input
of the processor). If the data is not ready, the processor enters a Tw state and remains in this state
until data is ready.


Tw states may be repeated as many times as necessary to allow sufficient time for the memory or 
I/O device to respond. 


After a data word is transferred, 
the 80960KB processor exits the Td or Tw state for a single word 
transfer or enters the Td state again to transfer another data word for a burst transaction. 
If the next 
data word is not ready during the next clock cycle for a burst transaction, the processor enters the Tw 
state again. 


When the 80960KB processor completes the data transfer of all the data words (one or up to four),
it enters the recovery (Tr) state to allow sufficient time for devices (such as memories) on the bus to
recover. The processor returns to the Ti state if no new request is pending, or enters the Ta state if a
new request is pending.


[Figure 4: L-bus state diagram]

Legend:
Ti - idle state
Ta - address state
Td - data state
Tr - recovery state
Tw - wait state
READY - READY asserted
NOT READY - READY not asserted
BURST - multiple-word access in progress
NO BURST - multiple-word access done, or a one-word access
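
The state transitions described above can be sketched as a small Python model. It is an illustration under stated assumptions rather than a cycle-accurate description: READY is treated as sampled once in every data or wait cycle, a cycle that follows a cycle without READY is a Tw state, and the function name and its input list are hypothetical.

```python
def bus_states(ready_per_cycle, words=1):
    """Return the L-bus state sequence for one transaction.

    ready_per_cycle: booleans, one per data/wait cycle, telling whether
    READY is asserted in that cycle (hypothetical input format).
    words: number of words to transfer (one to four).
    """
    if not 1 <= words <= 4:
        raise ValueError("a transaction transfers one to four words")
    states = ["Ta"]                      # address state starts the transaction
    transferred = 0
    waiting = False
    for ready in ready_per_cycle:
        states.append("Tw" if waiting else "Td")
        if ready:
            transferred += 1             # one word moves when READY is seen
            waiting = False
            if transferred == words:
                break
        else:
            waiting = True               # next cycle is a wait state
    if transferred != words:
        raise ValueError("READY was not asserted often enough")
    states.append("Tr")                  # recovery state ends the transaction
    return states


print(bus_states([True]))                        # single word, no wait states
print(bus_states([False, True, True], words=2))  # burst with one wait state
```

After the Tr state, the processor would move to Ti or Ta depending on whether a request is pending; that choice is left outside this model.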


The L-bus states are used to define some of the L-bus signals. As shown in Figure 5, the signals on 
the L-bus consist of two basic groups: address/data, 
and control. 

[Figure 5: L-bus signal groups: address/data and control (12 lines)]


The address/data signal group consists of 32 bidirectional lines. These signals are multiplexed and
serve a dual purpose depending upon the bus state.


Local Address/Data 31 through Local Address/Data 2 (LAD31-LAD2) represent the address
signals on the L-bus during the Ta state. LAD2 is the least significant bit, and
LAD31 is the most significant address bit. LAD31 through LAD2 contain a
physical word address.


LAD1 and LAD0 specify the number of data words to transfer for a burst
transaction. The address/data signals float to a high impedance state when
not activated.


SIZE (LAD1-LAD0)
The SIZE signal indicates whether one, two, three, or four words are
transferred during the current transaction. During a Ta state, LAD1 and
LAD0 represent the word size signals. The encoding is shown in Table 1.


Word Selection    LAD1    LAD0

1 Word            Low     Low
2 Words           Low     High
3 Words           High    Low
4 Words           High    High
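
The address-state encoding in Table 1 can be illustrated with a short Python sketch. It assumes that "High" corresponds to a logic 1 on the line, so the two low-order bits carry the word count minus one; the function names are hypothetical.

```python
def encode_address_cycle(byte_address, words):
    """Build the 32-bit value placed on LAD31-LAD0 during the Ta state.

    LAD31-LAD2 carry the word address; LAD1-LAD0 carry the SIZE field.
    """
    if not 1 <= words <= 4:
        raise ValueError("a transaction transfers one to four words")
    if not 0 <= byte_address < 2**32:
        raise ValueError("address must fit in 32 bits")
    if byte_address % 4:
        raise ValueError("address must be word aligned")
    return byte_address | (words - 1)    # 1 word -> 00, 4 words -> 11


def decode_address_cycle(lad):
    """Recover (word-aligned byte address, word count) from a Ta value."""
    return lad & 0xFFFFFFFC, (lad & 0x3) + 1


value = encode_address_cycle(0x1000, 4)
print(hex(value), decode_address_cycle(value))
```

A memory controller would perform the decode step when it latches the address, using the word count to decide how many data states to expect.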


Local Address/Data 31 through Local Address/Data 0 (LAD31-LAD0) represent the data
signals on the L-bus during the Td and Tw states. LAD0 is the least significant
and LAD31 is the most significant data bit. The address/data signals float
to a high impedance state when not activated.


The control signal group consists of 12 signals that permit the transfer of data. These signals can be 
used to control data buffers, address latches, and other standard interface logic. 


The Address Latch Enable (ALE) is an active-low signal that can be used to latch
the address from the 80960KB processor. ALE is asserted during the Ta state
and deasserted before the beginning of the Td state. ALE floats to a high
impedance level when the processor is not operating on the bus (i.e., it is in
the idle state), or is at the end of any bus access.


Address/Data Status (ADS) is an active-low signal that is driven by the 80960KB
processor to indicate an address state. ADS is asserted during every Ta state
and deasserted during the following Td and Tw states. For a burst transaction,
ADS is asserted again every Td (and Tw) state where READY was asserted
in the prior cycle. The signal is an open drain output.


Data Transmit/Receive (DT/R) indicates the direction of data flow to or from the
L-bus. For a read operation or an interrupt acknowledgement, DT/R is low
during the Ta, Tw, and Td states to indicate that data flows into the 80960KB
processor. For a write operation, DT/R is high during the Ta, Tw, and Td states
to indicate that data flows from the 80960KB processor. DT/R never
changes state when DEN is asserted. The DT/R line is an open drain output
of the 80960KB processor.


Data Enable (DEN) is an active-low signal that can be used to enable data transceivers.
DEN is asserted during all Td and Tw states. The DEN line is an open
drain output of the 80960KB processor.


The Write/Read (W/R) signal instructs a memory or I/O device to write or read data
on the L-bus. The 80960KB processor asserts W/R during a Ta state. The
signal remains valid during subsequent Td and Tw states. W/R is an open
drain output of the 80960KB processor.


The Byte Enable (BE3-BE0) output signals of the 80960KB processor specify which
bytes (up to four) on the 32-bit data bus are transferred during the transaction.
Table 2 shows the decoding scheme.


The byte enable signals are valid from the 80960KB processor before data
is transferred, as shown in Figure 6 (assumes no wait states). The byte enable
signals that are valid for the first data word are specified during the Ta state.
For a four-word burst transaction, the byte enable signals that are valid for
the second word are asserted during the first data state (Td0), for the third word
during the second data state (Td1), and for the fourth word during the third
data state (Td2). The byte enable signals are undefined during the last data
state (Td3) of the last word transferred.


Byte Enable Signal    Address Line Selection

BE0                   LAD7-LAD0
BE1                   LAD15-LAD8
BE2                   LAD23-LAD16
BE3                   LAD31-LAD24


Although not shown in the diagram, the byte enable signals of each word are 
latched internally by the 80960KB processor and remain valid during every 
data or wait state until READY is applied. After READY is applied the byte 
enable signals change during the next Td state or become undefined for the 
last data transfer. 


The 80960KB processor asserts only adjacent byte enables. For example,
the 80960KB processor does not perform a bus operation with only BE0 and
BE2 active.
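
The byte enable mapping of Table 2 and the adjacency rule can be sketched in Python. The sketch represents the four byte enables as a logical bit mask (bit n set means BEn selects its byte lane) and does not model the electrical polarity of the pins; all function names are hypothetical.

```python
def byte_enables(offset, length):
    """Mask of byte enables for `length` bytes starting at byte `offset`
    within a 32-bit word (offset 0 is the lane on LAD7-LAD0)."""
    if not (0 <= offset <= 3 and 1 <= length <= 4 - offset):
        raise ValueError("byte range must lie within one word")
    return ((1 << length) - 1) << offset


def is_adjacent(mask):
    """True if the selected byte enables form one contiguous run."""
    if not 0 < mask <= 0xF:
        return False
    lowest = mask & -mask            # isolate the lowest selected bit
    carried = mask + lowest          # a contiguous run collapses to one bit
    return carried & (carried - 1) == 0


def data_lanes(mask):
    """List the (high, low) LAD bit ranges selected by the mask."""
    return [(8 * n + 7, 8 * n) for n in range(4) if mask >> n & 1]


mask = byte_enables(1, 2)            # bytes 1 and 2 of the word
print(bin(mask), is_adjacent(mask), data_lanes(mask))
```

A mask such as 0b0101 (only BE0 and BE2) fails the adjacency check, matching the example in the text.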


The READY signal indicates that the data on the L-bus can be sampled (read)
or removed (write) by the 80960KB processor. If READY is not asserted
following a Ta state or in between Td states, a Tw state is generated.
READY is an active-low input signal to the 80960KB processor.


Bus Lock prevents other bus masters from gaining control of the L-bus 
during 
a bus operation. 
It is activated 
by certain 
80960KB 
processor 
operations and instructions. 


The 80960KB 
processor 
uses the bus LOCK signal when it performs 
a 
RMW memory 
operation. 
When the processor 
performs 
a RMW-Read 
operation, it asserts the LOCK signal during the Ta state and holds LOCK 
asserted. If the signal was already asserted, the processor waits until this
signal is deasserted before performing the RMW-Read operation. The
processor deasserts the LOCK signal during the Ta state when it performs 
a RMW-Write 
operation. 


The 80960KB 
processor 
asserts the LOCK 
signal during 
the interrupt 
acknowledge 
sequence. LOCK is an input and an open drain output. 


The Cacheable 
signal specifies 
whether 
the data is cacheable. 
If the 
80960KB processor 
asserts CACHE during the Ta state, then the data is 
cacheable. The CACHE signal is undefined during the Td and Tw states. The 
CACHE 
signal floats to a high impedance 
state when the L-bus is not 
acquired. 


Signal Group         Signal Symbol          Signal Function                          Active State        Direction   Type of Output

Local Address/Data   Address (LAD31-LAD2)   32-bit address                           Ta                  O           3-state
                     Data (LAD31-LAD0)      32-bit data                              Td, Tw              I/O         3-state
                     Size (LAD1-LAD0)       Specifies number of words to transfer    Ta                  O           3-state

Control              ALE                    Enables address latch                    Ta                  O           3-state
                     ADS                    Identifies an address state              Ta, Td, Tw          O           Open drain
                     DT/R                   Controls direction of data flow          Ta, Td, Tw          O           Open drain
                     DEN                    Enables data transceiver/latch           Td, Tw              O           Open drain
                     W/R                    Read/write command                       Ta, Td, Tw          O           Open drain
                     BE3-BE0                Specifies which data bytes to transfer   Ta, Td(1), Tw(2)    O           Open drain
                     READY                  Indicates data is ready to transfer      Td, Tw              I           -
                     LOCK                   Locks bus                                Any                 I/O         Open drain
                     CACHE                  Indicates cacheable transaction          Ta                  O           3-state

Notes:
1. Except first Td, Tw
2. Except last Td, Tw


Additional pins are used by the 80960KB processor to control the execution of instructions 
and to 
interface to other bus masters. These pins include the arbitration, interrupt, error, and reset signals. 
Each of these signal groups is explained in a separate section.


The 80960KB processor 
uses the L-bus signals to perform transactions, 
which are simply L-bus 
operations where data is transferred to (or from) the CPU from (or to) another component. 
During 
a transaction, 
the 80960KB processor can transfer up to four words of data for a single address to 
enhance system throughput. This is especially 
useful when loading cache memory. 


The 80960KB hardware system typically uses two clock signals, CLK2 and CLK, to synchronize the 
transitions between L-bus states. CLK2 is the clock input to the 80960KB and is double the specified 
processor frequency. CLK is the clock input signal to the peripheral devices, and it is the operating 
frequency of the 80960KB processor. Figure 7 shows the relationship between the system CLK2 and
CLK. 


[Figure 7: Relationship between CLK2, CLK, and the bus states]

The basic transaction reads or writes one data word. Figure 8 shows a typical timing diagram for a 
basic read transaction (for exact timings, see the 80960KB processor data sheet). A read transaction 
may be preceded and succeeded by any type of bus transaction. The following sequence of events 
explains the flow of the timing diagram. For simplicity, no wait states are shown. 


1. 
The 80960KB processor generates several signals during the Ta state. 


It transmits the address on the address/data lines. LAD1 and LAD0 specify a single word
transaction.


It asserts ALE. An ALE signal can be used to latch the address. 


It asserts ADS. 


It asserts BE3-BEo to specify which bytes are used when reading the data word. 


It brings W/R low to denote a read operation. 


It brings DT/R signal low. DT/R can be used for the direction input to data transceivers. 


[Figure 8: Basic read transaction timing, showing CLK2, CLK, LAD31-LAD0, ALE, ADS, BE3-BE0,
W/R, DT/R, DEN, and READY across the Ta, Td, and Tr states]


2. 
During the Td state, several actions occur. 


The 80960KB processor reads the data on the address/data 
lines. 


The 80960KB 
processor 
asserts DEN. DEN can be used to enable data transceivers. 


READY is asserted by external timing logic and data is transmitted from the storage
devices. If READY is not asserted, the data transfer is delayed, generating a Tw state. The
Tw state is repeated until READY is asserted.


3.
The Tr state follows the data state. This allows the system components adequate time (one
processor clock cycle) to remove their outputs from the bus before the 80960KB processor
generates the next address on the address/data lines. During the Tr state, W/R, DT/R, and DEN
become inactive.


Figure 9 shows a typical timing diagram for a basic write transaction with one wait state. Like the
read transaction, 
a write operation may be preceded and succeeded by any type of bus transaction. 


The following sequence of events explains the flow of the timing diagram. 


1.
Similar to the read transaction, the 80960KB processor generates several signals during the Ta
state.


It transmits the address on the address/data lines. LAD1 and LAD0 specify a single word
transaction.


It asserts ALE. An ALE signal can be used to latch the address. 


It asserts ADS. 


It asserts BE3-BEo to specify which bytes are used when writing the data word. 


It brings W/R high to denote a write operation.


It brings the DT/R signal high. DT/R can be used for the direction input to data transceivers.


2.
During the Td state, several actions occur.


The 80960KB places the data on the address/data 
lines. 


The 80960KB processor asserts DEN. DEN can be used to enable data transceivers. 


READY is not asserted by external timing logic. Consequently, 
data is held on the LAD 
lines. 


During the Tw state, READY is asserted and the data is written to the storage device. Note that
W/R, DT/R, and DEN remain constant until the bus state after READY is asserted.


The 80960KB processor supports burst transactions that read or write up to four words at a maximum 
rate of one word every processor clock cycle. Burst transactions 
are always contained within a 16- 
byte boundary. If a transaction crosses a 16-byte boundary, the 80960KB processor automatically 
splits the transaction 
into two accesses. 
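
The 16-byte boundary rule can be illustrated with a short Python sketch. It shows only the boundary split described above; the way the processor breaks up other unaligned accesses is not modeled, and the function name is hypothetical.

```python
def split_at_16_byte_boundary(byte_address, length):
    """Split a byte range into accesses that each stay inside one
    16-byte block. Returns a list of (start_address, byte_count)."""
    if length < 1:
        raise ValueError("length must be at least one byte")
    accesses = []
    end = byte_address + length
    while byte_address < end:
        block_end = (byte_address | 0xF) + 1     # first address of next block
        chunk_end = min(end, block_end)
        accesses.append((byte_address, chunk_end - byte_address))
        byte_address = chunk_end
    return accesses


# A 16-byte access starting 8 bytes into a block needs two transactions.
print(split_at_16_byte_boundary(0x1008, 16))
# An aligned 16-byte access fits in a single four-word burst.
print(split_at_16_byte_boundary(0x1000, 16))
```

Because no access crosses a block boundary, a memory controller only needs to increment the two low-order word address bits during a burst.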


The byte enable signals are valid for each word to allow partial-word write operations for a burst write 
transaction. The CACHE output signal during a Ta state applies to all words of a burst transaction. 


A burst read or write transaction is similar to a basic read or write operation. It differs primarily in 
the number of data words transferred: the basic transaction always transfers one data word, the burst 
transaction transfers up to four data words. For a burst transaction, the byte enable signals are applied 
during the Ta state, and subsequently 
during every Td or Tw state before the data word is transferred. 
Figure 10 shows the timing for a three-word 
burst read transaction 
without wait states. Figure 11 
shows the timing for a two-word burst write transaction with a wait state occurring during the transfer 
of the first word. Note that the byte enable signals remain constant until the data state after READY 
is asserted. 


In an 80960KB processor-based system, timing signals must be generated for the clock and reset
inputs. To generate these signals, discrete logic should be used to minimize skew and keep the
rise and fall times as short as possible. This section describes a typical circuit that synthesizes
the clock signal. (The RESET timing generation is discussed in the "RESET and Initialization"
section.)


In order to design a clock generator, the clock input specifications of the 80960KB processor are
examined first. The clock (CLK2) waveform is shown in Figure 12. The clock pulse is specified by
the five parameters listed below:


The clock high time (th)


The clock fall time (tf)


The clock low time (tl)


The clock rise time (tr)


The clock period (tcy)


[Timing diagram: CLK2, CLK, LAD31-LAD0, ALE, and ADS over five bus states]


The time required to go from 90% of the difference between the high and low voltage levels to 10%
of the difference (or from low to high) is defined as the clock fall (rise) time. The clock low time
specifies the time required for the clock to remain within 10% of the low voltage level. Similarly, the
clock high time specifies the required time for the clock pulse to remain within 10% of the high
voltage level. The clock period is the sum tf + tl + tr + th.


The clock generator must have fast enough rise and fall times to comply with the requirements for
high and low time and the overall clock period. For example, consider a clock pulse with a 50% duty
cycle at 40 MHz. The clock period is specified at a minimum of 25 ns, low time at a minimum of 8 ns,
and high time at a minimum of 8 ns. This implies that the sum of the rise and fall times must not be
greater than 9 ns. Thus, the clock generator should be designed to have rise and fall times not greater
than 4.5 ns each.
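
The edge-time budget in this example is simple arithmetic, sketched below in Python. The function name is hypothetical, and the equal split between rise and fall time is the same simplification the text makes.

```python
def max_edge_time_ns(period_ns, min_low_ns, min_high_ns):
    """Largest equal rise and fall time that still fits in the period."""
    budget = period_ns - min_low_ns - min_high_ns   # time left for tr + tf
    if budget < 0:
        raise ValueError("high and low times exceed the clock period")
    return budget / 2


print(max_edge_time_ns(25, 8, 8))   # 4.5
```

The same function can be reused for other speed selections by substituting the period and minimum high and low times from the data sheet.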


[Timing diagram: CLK2, CLK, LAD31-LAD0, ALE, ADS, BE3-BE0, W/R, DT/R, DEN, and READY,
beginning with the Ta and Td states]


Besides specifying a maximum clock rate, the 80960KB processor requires a minimum CLK2 rate 
of 8 MHz to maintain 
the state of the internal dynamic cells. Due to this minimum 
frequency 
requirement, 
the 80960KB processor cannot be single-stepped 
by disabling the clock. 


Figure 13 shows an example of a clock generator that produces two clock pulses, one double the
frequency of the other, with the skew between the pulses in the range of 1 to 3 ns. This particular
circuit produces a 40-MHz clock at 50% duty cycle with rise and fall times of less than 4 ns. The circuit
design consists of four devices: an oscillator, 
a pulse shaping network, a synchronous 
up/down 
counter, and a NAND gate driver. The output of the 80-MHz hybrid clock oscillator connects to the 
pulse shaping network (two NAND gates in series) which in turn feeds into the clock input of the up/ 
down counter. This counter produces a 40-MHz CLK2 output signal and a 20-MHz CLK output 
signal. Because the outputs of the counter are synchronous, 
the skew between CLK2 and CLK is 
typically less than 2 ns. To provide adequate signal margin and maintain fast rise and fall times, the 
two clock signals are conditioned 
by the NAND gate driver. The timing waveforms 
of the clock 
circuit are shown in Figure 14. 


The hybrid clock oscillator typically requires 5 ms to stabilize after power is applied. The 80960KB
processor cannot begin to execute instructions until after the clock and VCC have reached their DC
and AC specifications. The RESET signal can be used to control the start of the CPU execution when
power is applied. This is discussed in the "RESET and Initialization" section.


When multiple bus masters exist, an arbitration protocol is used to exchange control of the bus. The 
protocol assumes that there are two bus masters: one that controls the bus by default, and the other 
that requests control of the bus when it performs an operation, such as a DMA controller. More than 
two bus masters may exist on the L-bus, but this requires external arbitration logic. There should be 
no more than two 80960KB processors, 
however, on an L-bus. 


Assuming that there are only two bus masters, this section examines the bus arbitration, bus states, 
and timing diagrams for different combinations 
of bus masters, as shown in Table 4. 


Bus Master     Bus Master that Controls    Bus Master that Requests
Combination    the Bus by Default          Control of the Bus

Case 1         80960KB processor           I/O device
Case 2         80960KB processor           80960KB processor
Case 3         I/O device                  80960KB processor


For the first case, the 80960KB processor controls the L-bus, and a master I/O peripheral, such as a
DMA controller, requests control of the bus for operations.
The 80960KB processor 
and the I/O 
peripheral 
exchange 
control of the bus with two signals: 
the hold request 
(HOLD) 
and hold 
acknowledge 
(HLDA) signals. 


HOLD is an input signal of the 80960KB processor, which indicates that the master I/O peripheral 
is requesting 
control of the L-bus. When HOLD is asserted, the 80960KB processor 
surrenders 
control of the bus after it completes the current bus transaction. The processor acknowledges 
transfer 
of control of the L-bus to the other bus master by asserting the HLDA. 


Figure 15 shows the state diagram for an L-bus with an I/O peripheral bus master. This state diagram
consists of the hold state (Th) in addition to the five basic states described in the "Basic L-Bus State"
section. The 80960KB processor enters the Th state when it surrenders control of the bus. It can
enter the Th state from the Ti or Tr state. When the 80960KB processor regains control of the L-bus,
it enters the Ta state if a new request is pending or a Ti state if no new request is pending.


[Figure 15: L-bus state diagram with hold state. Transition labels include REQUEST PENDING, HOLD,
NO HOLD, READY, BURST, and NO BURST.]

Legend:
Ti - idle state
Ta - address state
Td - data state
Tr - recovery state
Tw - wait state
Th - hold state
READY - READY asserted
NOT READY - READY not asserted
BURST - multiple-word access in progress
NO BURST - multiple-word access done, or a one-word access


Figure 16 shows the arbitration timing diagram. The "T" state represents the last cycle of a transaction
in which the READY signal was asserted or a Ti state. The 80960KB processor receives a request
to relinquish control of the bus when HOLD is asserted. After the 80960KB processor completes the 
current 
transaction, 
it responds 
to this request 
by floating 
the three-state 
output 
signals 
and 
deasserting the open drain output signals. The HLDA output signal, however, remains active and is 
asserted as the 80960KB processor enters a Th state. During the Th state, the CPU ignores all input 
signals except HOLD and RESET. When the HOLD input signal is deasserted, 
the 80960KB 
processor exits the Th state and deasserts HLDA. 


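[Figure 17: Arbitration connection between the primary bus master and the secondary bus master.
HLDA of the primary bus master connects through a delay to HLDAR of the secondary bus master;
HOLDR of the secondary bus master connects through a delay to HOLD of the primary bus master.]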


For the next case, two 80960KB processors reside on the L-bus. During initialization, one is
designated as the Primary Bus Master (PBM), the other as the Secondary Bus Master (SBM).


The exchange protocol that is used guarantees that neither device is kept off the bus indefinitely. The
80960KB processors use two pins for bus arbitration: the HOLD input pin and the HLDA output
pin. These input and output pins for the SBM are interpreted differently, however.


When the SBM is initialized, the pin normally used for HOLD input signal is interpreted as the hold 
acknowledge 
request (HLDAR) input signal. The assertion of 45 HLDAR indicates that the PBM 
relinquished 
control of the L-bus. Similarly, the HLDA output signal of the SBM is interpreted 
as 


the hold request (HOLDR) output signal. The SBM asserts HOLDR to request acquisition of the L- 
bus. Thus, bus arbitration between two 80960KB processors 
can be accomplished 
by connecting 


HOLD of the PBM to HOLDR of the SBM, and HLDA of the PBM to the HLDAR of the SBM, as 
shown in Figure 17. 


When using the connection shown in Figure 17, a delay must be inserted between the input and output
signals because the minimum clock-to-output delay is less than the maximum hold time of the input
signals. The delay time must be greater than 5 ns, but less than the clock period minus the setup time
minus the maximum clock-to-output delay (5 ns ≤ Delay ≤ Tperiod - Tsetup - Tclock-to-output).
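
The delay constraint can be checked numerically. The following Python sketch is only an illustration: the function name and the sample timing values are hypothetical, and the real setup and clock-to-output figures must come from the data sheet.

```python
def hold_delay_window_ns(t_period_ns, t_setup_ns, t_clock_to_output_max_ns,
                         min_delay_ns=5.0):
    """Return the (minimum, maximum) allowed delay in ns, or None if no delay fits.

    Implements 5 ns <= Delay <= Tperiod - Tsetup - Tclock-to-output.
    """
    max_delay_ns = t_period_ns - t_setup_ns - t_clock_to_output_max_ns
    if max_delay_ns < min_delay_ns:
        # The clock is too fast (or the signals too slow) for any delay to work.
        return None
    return (min_delay_ns, max_delay_ns)


# Hypothetical timing values, chosen only to exercise the formula.
window = hold_delay_window_ns(t_period_ns=50.0, t_setup_ns=3.0,
                              t_clock_to_output_max_ns=20.0)
print(window)
```

A result of None means that the upper bound falls below 5 ns, so the connection of Figure 17 cannot meet timing at that clock period.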


The state diagram for the SBM is shown in Figure 18. Because there are two 80960KB processors,
the LOCK signal is included in the state diagram. The SBM requests control of the L-bus by asserting
HOLDR and subsequently enters the hold request (Thr) state, provided that the bus is not locked
(locked means that LOCK is asserted by the PBM and the SBM has an RMW operation pending). The
SBM remains in the Thr state until it acquires control of the L-bus by receiving HLDAR. The SBM
returns to the Ti state by deasserting HOLDR provided that the following two conditions exist:

An RMW operation is pending

The PBM asserted LOCK while the SBM was in the Thr state.

The SBM gains control of the bus when HLDAR is asserted, provided that the bus is not locked. After
gaining control of the L-bus, the SBM performs the operations and enters a Tw state if necessary. At
the end of a transaction, the SBM goes to the Tr state and deasserts HOLDR for at least one processor
clock cycle to allow another peripheral bus master to gain access if needed. If another request is
pending, the SBM enters the Thr state and asserts HOLDR, provided the bus is not locked. The PBM
never forces the SBM off the bus.

Figure 19 shows the timing diagram for acquiring and relinquishing the L-bus by an SBM. The SBM
enters the hold request (Thr) state and asserts the HOLDR signal. It remains in the Thr state until
HLDAR is asserted, which indicates that the SBM can use the L-bus during the next state. When
the bus is no longer required, HOLDR is deasserted during the state following the last READY signal.

Except for HOLDR, the output signals of the SBM go into a high-impedance state or, in the case of
open-drain outputs, are deasserted.
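
The SBM behavior described above can be sketched as a small state-transition function. This Python model is a simplification written for illustration: the function and argument names are hypothetical, one call represents one processor clock, and details that only Figure 18 shows (for example, exactly when READY and BURST are sampled) are assumptions.

```python
def sbm_next_state(state, *, request_pending=False, lock_asserted=False,
                   rmw_pending=False, hldar=False, ready=False, burst=False):
    """Return (next_state, holdr) for a simplified SBM arbitration model."""
    # "Locked" as defined in the text: LOCK asserted by the PBM while the
    # SBM has an RMW operation pending.
    locked = lock_asserted and rmw_pending

    if state in ("Ti", "Tr"):
        # Idle or recovery: HOLDR is deasserted unless a new request can start.
        if request_pending and not locked:
            return "Thr", True
        return "Ti", False
    if state == "Thr":
        if locked:
            return "Ti", False      # give up the request while the bus is locked
        if hldar:
            return "Ta", True       # PBM has relinquished the L-bus
        return "Thr", True          # keep waiting with HOLDR asserted
    if state == "Ta":
        return "Td", True
    if state in ("Td", "Tw"):
        if not ready:
            return "Tw", True       # insert a wait state
        if burst:
            return "Td", True       # more words in a multiple-word access
        return "Tr", False          # transaction done: deassert HOLDR in Tr
    raise ValueError(f"unknown state: {state}")


# Walk through a one-word access with no wait states.
state, holdr = sbm_next_state("Ti", request_pending=True)
state, holdr = sbm_next_state(state, request_pending=True, hldar=True)
state, holdr = sbm_next_state(state)
state, holdr = sbm_next_state(state, ready=True)
print(state, holdr)
```

Because HOLDR is returned as False on entry to Tr, the model reflects the requirement that HOLDR be deasserted for at least one processor clock cycle between transactions.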


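[Figure: SBM state diagram. The states shown include Ti (idle state), Ta (address state), Td (data
state), and Tr (recovery state); the idle condition is labeled NO REQUEST + LOCKED.]
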
Figure 20 shows an example of bus arbitration between a PBM and an SBM using the arbitration
signals. Each bus master performs a one-word read and a two-word write transaction to demonstrate
the fastest possible bus exchanges.

While the PBM is performing a read transaction, the SBM requests control of the L-bus by asserting
HOLDR and entering the Thr state. It remains in this state until the PBM grants the request by asserting
HLDA after the read transaction is completed. After granting the request, the PBM enters the Th state
and remains in this state until its HOLD signal is deasserted. When the SBM completes the read
transaction, it deasserts HOLDR and gives control back to the PBM.

The PBM now performs a two-word write transaction after deasserting HLDA. The SBM requests
control of the bus again by asserting the HOLDR signal and enters the Thr state. When the PBM
completes the two-word write transaction, it grants the request by asserting HLDA and enters the Th
state. The SBM receives the signal on the HLDAR input and performs a two-word write transaction.
When the SBM completes the transaction, control of the L-bus is transferred to the PBM, and both
the PBM and the SBM enter the Ti state.

Another case exists where a peripheral device controls the L-bus, and the 80960KB processor
requests control of the bus to perform operations. This alternative is not advisable because it hinders
system performance. The exchange protocol is identical to the one described in the previous section.
The 80960KB processor is an SBM and uses two pins for bus arbitration: the HLDAR input pin and
the HOLDR output pin. The state diagram is similar to the one shown in Figure 18. The lock
conditions are not used for this case, however.

The peripheral device grants control of the L-bus by asserting HLDAR when the SBM requests use
of the L-bus. The peripheral device can obtain control of the L-bus again by deasserting HLDAR.
If this occurs, the 80960KB processor surrenders control of the bus after it completes the current
transaction, as shown in Figure 21. At that time, the 80960KB processor deasserts the HOLDR signal
and places the other output signals into a high-impedance state or a deasserted open-drain level. The
80960KB processor may request access to the L-bus by asserting HOLDR again.


The IAC mechanism gives 80960KB processors the capability to send messages to and receive messages
from one another and other bus agents. The IAC mechanism is essentially a non-maskable interrupt with
predefined service routines. These routines are implemented in the 80960KB processor and are used to
perform control functions such as purging the instruction cache, setting breakpoint registers, or
stopping and starting the processor. By using IAC messages, external agents can remotely control
the 80960KB. This allows easy integration of the 80960KB into system environments.

IAC messages can also be used to generate interrupts that behave exactly the same as hardwired
interrupts. Since the interrupt vector is encoded in the IAC message, any of the possible interrupt
service routines can be invoked.

Figure 22 shows a typical example of an IAC operation. In this case, an external processor gains
control of the 80960KB by using an IAC operation. The external processor performs two functions:
it writes the message in a buffer, called the message buffer; and it asserts the IAC pin of the 80960KB
processor. Upon receipt of the IAC signal, the 80960KB processor stops executing its current process
and performs a four-word read of the message buffer. After completing the read operation, the
80960KB processor automatically performs a one-word write operation to a predefined address to
acknowledge the receipt of the message. The 80960KB processor then proceeds to perform the
required action.


The IAC messages are specifically defined and behave much like machine instructions. The
80960KB processor reserves the upper 16M bytes (FF000000H to FFFFFFFFH) of the 4G-byte
address range for IAC message operations.
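
As a quick check of the reserved range, the following Python sketch tests whether an address falls in the IAC space. The constant and function names are hypothetical; only the two boundary addresses come from the text.

```python
IAC_SPACE_START = 0xFF000000   # first reserved address (from the text)
IAC_SPACE_END = 0xFFFFFFFF     # last address of the 32-bit address space
IAC_SPACE_SIZE = IAC_SPACE_END - IAC_SPACE_START + 1


def is_iac_address(address):
    """Return True if a 32-bit address lies in the reserved IAC range."""
    return IAC_SPACE_START <= address <= IAC_SPACE_END


print(IAC_SPACE_SIZE // (1024 * 1024), "Mbytes reserved")
print(is_iac_address(0xFF000010), is_iac_address(0x00001000))
```

The reserved range is 1/256 of the full 32-bit address space, which is why address decoding for IAC hardware only needs to examine the upper eight address bits.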


There are two types of IAC messages: internal and external. Internal IAC messages allow a program
to send a command to its own processor. An internal IAC message is sent by writing to address
FF000010H. Internal IAC messages cause no L-bus activity.

External IAC messages can be used to send a command to another processor on the L-bus or to a
remote processor. A processor sends an external IAC message by writing to a buffer area and causing
the IAC pin of the receiving 80960KB to be asserted.

When the IAC pin is asserted, the recipient processor reads the reserved address to fetch the data from
its IAC message buffer. After reading the IAC message buffer, the recipient does a write operation
to another reserved address to acknowledge receipt of the IAC message. The IAC pin is deasserted
as a result of this write operation, and the processor is ready to receive another IAC.

To use the external IAC feature of the 80960KB, the following items are needed: a four-word
message buffer RAM mapped to a reserved address to store the message, logic to assert the IAC pin
of the 80960KB, and decoding logic to deassert the IAC pin on command from the 80960KB.


Each 80960KB processor that receives an IAC message must have four 32-bit words of message
buffer. This buffer can use special hardware or a reserved area in RAM. For proper operation of the
buffer, two requirements must be met: the receiving 80960KB must be able to read this buffer at
FF000010H if the receiving 80960KB's Local Processor Number (LPN) is equal to zero (see the
"RESET and Initialization" section for details of the LPN), or at FF000030H if the LPN is equal to
one; and the sending processor must be able to write this buffer.

When the IAC message buffer receives a message, logic asserts the IAC pin and keeps it asserted.
After the 80960KB processor reads the IAC message, it performs a one-word write to address
FF000000H if its LPN is zero, or FF000020H if its LPN is one. This reserved address serves two
functions: it causes external logic to deassert the IAC pin, and it maps to a register that contains the
current processor priority. If the low-order three bits of the data word have a value of 100B (see Figure
23), the external logic should deassert the IAC pin on completion of the write operation.
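
The address decoding that the external logic must perform can be sketched in Python. This is a behavioral illustration under stated assumptions, not a hardware design: the function and constant names are hypothetical, and only the four addresses and the 100B pattern come from the text.

```python
IAC_BASE = 0xFF000000      # acknowledge/priority address for LPN 0 (from the text)
LPN_STRIDE = 0x20          # LPN 1 uses FF000020H, so the stride is 20H
BUFFER_OFFSET = 0x10       # message buffer sits 10H above the acknowledge address
IAC_ACK_PATTERN = 0b100    # low-order three bits of an acknowledge write


def ack_address(lpn):
    """Address written by the processor to acknowledge an IAC message."""
    if lpn not in (0, 1):
        raise ValueError("LPN must be 0 or 1")
    return IAC_BASE + LPN_STRIDE * lpn


def message_buffer_address(lpn):
    """Address from which the processor reads its four-word message buffer."""
    return ack_address(lpn) + BUFFER_OFFSET


def should_deassert_iac(address, data_word, lpn):
    """True if this write should cause external logic to deassert the IAC pin."""
    return address == ack_address(lpn) and (data_word & 0b111) == IAC_ACK_PATTERN


print(hex(message_buffer_address(0)), hex(message_buffer_address(1)))
print(should_deassert_iac(0xFF000000, 0x00000004, lpn=0))
```

Checking the data pattern as well as the address matters because the same address is also used for priority writes.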
The 80960KB contains an internal register that keeps track of the current priority (a value between
0 and 31) at which it is executing. This priority is used to decide whether or not to service interrupts:
higher-priority interrupts are serviced, and others are posted for later servicing. In some system
designs it may be desirable to have this priority visible outside of the processor. To allow this, the
80960KB provides support for an external priority register. Whenever the priority of the 80960KB
changes, the contents of this register are automatically updated.

This feature may be enabled in two steps. If the Write External Priority bit is set in the PRCB (see
the 80960KB CPU Programmer's Reference Manual), then the external priority register is updated
as a result of a MODPC instruction or whenever an interrupt occurs. If external IAC messages are
enabled, then the external priority is also updated whenever the result of an IAC is to change the
processor priority.


The 80960KB expects to write its priority into a 5-bit register mapped to address FF000000H if its
LPN is zero, or FF000020H if its LPN is one. To set the priority, the processor performs a one-word
write operation in the form shown in Figure 23. The priority is contained in bits 20 through 16, and a
qualifier bit in the same word is asserted to indicate that the priority is changed. This qualifier bit is
needed to distinguish priority write operations from IAC message acknowledgments, which use the
same reserved address.


The external priority register can be used to filter IAC messages. Since the processor always services
the IAC pin (i.e., it is non-maskable), a low-priority IAC message can interrupt a high-priority
task. To prevent this, a system can associate a priority with each IAC message. This priority can then
be compared to the priority stored in the external priority register and used to decide whether or not
to accept the IAC message. One way to associate a priority with an IAC message is to encode the
message priority into the IAC message destination address as shown in Figure 24. The range of
reserved addresses shown in Figure 24 has been set aside for this purpose.
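
The priority capture and the IAC filter can be modeled together. In the following Python sketch, the function names are hypothetical, and the acceptance rule (accept only messages with a priority strictly higher than the registered priority) is an assumed policy, since the text leaves the decision to the system designer. The position of the priority field in the data word is taken from the description of the priority write; the qualifier-bit check is omitted.

```python
PRIORITY_SHIFT = 16     # priority occupies bits 20 through 16 of the data word
PRIORITY_MASK = 0x1F    # five bits, giving values 0 through 31


def priority_from_write(data_word):
    """Extract the 5-bit processor priority from a priority write."""
    return (data_word >> PRIORITY_SHIFT) & PRIORITY_MASK


def accept_iac(message_priority, register_priority):
    """Assumed filter policy: pass only messages above the current priority."""
    return message_priority > register_priority


# The external register latches the priority from a (hypothetical) write.
register_priority = priority_from_write(17 << PRIORITY_SHIFT)
print(register_priority)
print(accept_iac(20, register_priority), accept_iac(5, register_priority))
```

With this policy, the external logic would assert the IAC pin only when accept_iac returns True, and would otherwise hold or reject the message.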


[Figure: IAC message destination address, showing a PRIORITY field and an ADDRESS OF RECIPIENT field.]


The 80960KB processor responds to external events occurring at arbitrary times by means of an
interrupt signal. Various sources, which include hardware components and special software
instructions, generate an interrupt signal that can suspend execution of the 80960KB processor's
current instruction stream. The hardware-generated interrupts are discussed in this section. For
complete information on software-generated interrupts, see the Programmer's Reference Chapter of
this handbook.

The 80960KB is unusual in that the interrupt controller automatically does the processor
housekeeping tasks that are normally left for the programmer to deal with in the interrupt handling
routine. The local registers are pushed onto the stack, state is saved, arithmetic controls are saved,
the priority of the processor is changed to the interrupt priority, and stack pointers are managed.
All this is done automatically before entering the user-written interrupt routine. As a result, the
programmer can concentrate on the function of the interrupt handling routine rather than on processor
housekeeping, which greatly simplifies the programming and debugging effort.

The 80960KB processor provides a flexible interrupt structure. The 80960KB processor can be
interrupted using any of the three methods below:
Receipt of a signal on any or all of the four direct interrupt input signals (INT0, INT1, INT2, and
INT3)

Receipt of a signal on the interrupt request (INTR) line to obtain an external interrupt vector

Receipt of an IAC message from a processor program or external source.

The choice of the method is determined by the setting in the on-chip Interrupt Control register.

Interrupt signals can occur during any bus state regardless of which method is implemented.

This section provides details on the multiplexed interrupt pins, the three interrupt methods, the
Interrupt Control register, synchronization, and interrupt latency.


The interrupt signals are multiplexed on four pins of the 80960KB processor: INT0/IAC, INT1,
INT2/INTR, and INT3/INTA. The on-chip Interrupt Control register determines how these pins are used
(see "Interrupt Control Register" section).

INT0/IAC: This pin multiplexes the Interrupt0 and Inter-Agent Communication request input signals.
The 80960KB processor interprets this input signal as either INT0 or IAC. The INT0 signal indicates
a request for interrupt service when it is asserted. The IAC signal denotes that a message is waiting
when it is asserted.

INT1: The Interrupt1 input signal indicates a request for interrupt service when it is asserted.

INT2/INTR: This pin multiplexes the Interrupt2 and Interrupt Request input signals.
The 80960KB processor interprets this input signal as either INT2 or INTR.
The INT2 signal indicates a request for interrupt service when it is asserted.
The INTR signal indicates an interrupt request from an external interrupt
controller. The 80960KB processor responds with an interrupt-acknowledge
sequence. To ensure an interrupt, the INTR signal must remain
asserted until the first cycle of the interrupt-acknowledge transaction.

INT3/INTA: This pin multiplexes the Interrupt3 input signal and Interrupt Acknowledge
output signal. The 80960KB processor uses this pin as the INT3 input
signal or as the INTA output signal. The Interrupt Control register setting
selects either the combination of INTR/INTA or INT2/INT3. The INT3 input
signal indicates a request for interrupt service when it is asserted. INTA
acknowledges the interrupt request from an external interrupt controller. The
INTA signal is latched by the 80960KB processor and remains valid during
the Td state. This signal is an open-drain output.


The 80960KB processor uses a 32-bit, on-chip Interrupt Control register to define the function of the
multiplexed interrupt pins. This register allocates eight bits for each of the
four direct interrupt signals (INT0, INT1, INT2, and INT3). The eight bits contain the vector number
for each interrupt signal, as shown in Figure 25. The vector number is automatically read when one
of the interrupt signals (INT0, INT1, INT2, and INT3) is activated. For example, when an interrupt
is signaled on INT0, the 80960KB processor uses bit 7 through bit 0 of the Interrupt Control register
as the vector number.


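[Figure: Interrupt Control register layout. Four 8-bit vector fields, from most significant to least
significant: INT3 vector, INT2 vector, INT1 vector, INT0 vector.]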


The 80960KB processor uses the data field corresponding to INT0 to determine the function of the
INT0/IAC input pin; a value of 00H signifies the IAC function. If the data field corresponding to INT2
has a value of 00H, the 80960KB processor interprets the INT2/INTR pin as the INTR input signal,
and the INT3/INTA pin as the INTA output signal. In other words, this setting specifies that the
80960KB processor should use these two pins for communication with an external interrupt
controller. If the functions of INTR and INTA are selected, the direct interrupt pins (INT0 and INT1)
can still be used.


The on-chip Interrupt Control register may be read and written by the Synchronous Load (synld) and
Synchronous Move (synmov) instructions at the address FF000004H (see the 80960KB
Programmer's Reference Manual). The value of the data fields in the Interrupt Control register is
FF000000H after initialization. This setting specifies that the four interrupt pins function as INTA,
INTR, INT1, and IAC.
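
The pin selection rules can be illustrated by decoding a register value. In this Python sketch the function names and the returned strings are hypothetical, and the field placement (INT0 in bits 7 through 0, up to INT3 in bits 31 through 24) is inferred from the INT0 example and the reset value given in the text.

```python
def vector_field(icr, pin):
    """Return the 8-bit vector field for direct interrupt pin 0 through 3."""
    return (icr >> (8 * pin)) & 0xFF


def pin_functions(icr):
    """Describe how the four multiplexed pins are used for a register value."""
    functions = {}
    int0 = vector_field(icr, 0)
    # A value of 00H in the INT0 field selects the IAC function.
    functions["INT0/IAC"] = "IAC" if int0 == 0 else f"INT0 (vector {int0})"
    functions["INT1"] = f"INT1 (vector {vector_field(icr, 1)})"
    if vector_field(icr, 2) == 0:
        # A value of 00H in the INT2 field selects the INTR/INTA pair.
        functions["INT2/INTR"] = "INTR"
        functions["INT3/INTA"] = "INTA"
    else:
        functions["INT2/INTR"] = f"INT2 (vector {vector_field(icr, 2)})"
        functions["INT3/INTA"] = f"INT3 (vector {vector_field(icr, 3)})"
    return functions


# Register value after initialization, as given in the text.
for pin, function in pin_functions(0xFF000000).items():
    print(pin, "->", function)
```

Decoding the reset value FF000000H this way yields IAC, INT1, INTR, and INTA, which matches the pin functions listed for the initialized state.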


The 80960KB processor can be interrupted by asserting any or all of the four interrupt input signals
(INT0, INT1, INT2, INT3). If the signals are simultaneously asserted, the 80960KB assumes that INT0
has the highest priority, followed by INT1, INT2, and INT3. Software should follow this convention
when programming the Interrupt Control register. When the interrupt input signals are asserted, the
80960KB processor utilizes a vector number specified by the Interrupt Control register as an index
to an entry in the interrupt table located in memory. For complete software information on this topic,
see the Programmer's Reference Chapter of this handbook.

The 80960KB processor can communicate with an external interrupt controller by performing an
interrupt acknowledge sequence using the INTR and INTA signals. Figure 26 shows an example of
the timing of an interrupt acknowledge sequence using the 8259A Programmable Interrupt Controller.

INTR is asserted by the 8259A and remains asserted until the 80960KB processor activates the INTA
signal for the first time. When the 80960KB processor receives an interrupt request, the CPU
completes the current transaction (or comes to some interruptible point), and asserts INTA. INTA
remains valid through the Ta, Td, and Tw states. The first assertion of INTA triggers the 8259A to
resolve priority among its interrupt requests.


To compensate for the timing of the 8259A, the 80960KB processor automatically inserts five T
states before asserting INTA again to read the interrupt vector. Figure 26 shows READY asserted
without a wait state during the first Interrupt Acknowledgement cycle and with one wait state during
the second Interrupt Acknowledgement cycle. In practice, the 8259A would require about four wait
states in both cycles. The address during the Ta state for both interrupt acknowledge cycles is
FFFFFFFCH. For more details, see the "8259A Programmable Interrupt Controller" portion in
Section 5 of this chapter.


The 80960KB processor services the interrupt according to its priority. If the interrupt has higher
priority than the current activity, the 80960KB processor services it immediately. Otherwise, after
reading the interrupt vector, the 80960KB processor posts the interrupt vector in the interrupt table.
Typically, the 80960KB processor responds within 4 µs to an interrupt with higher priority than
the current process (assuming CLK2 at 40 MHz). If the interrupt has lower priority than the current
activity, the interrupt is serviced when its priority is higher than the priority of the subsequent activity
of the 80960KB processor.


The 80960KB processor can also be interrupted by an IAC message. The 80960KB processor can
send IAC messages to itself by using one of the Synchronous Move instructions. Because this
message does not utilize the L-bus when sent to the same processor, no special hardware is required.
More details are provided in the Programmer's Reference Chapter of this manual.


The INT0/IAC, INT1, INT2/INTR, and INT3 input signals can be either synchronous or asynchronous
to the system clock (CLK2). Synchronous interrupt signals must be set up 3 ns prior to the rising edge
of CLK2 and held for 10 ns after the rising edge of CLK2. To properly preset the interrupt signals
for synchronous operation, INT0/IAC, INT1, INT2/INTR, and INT3 must be deasserted for at least one
processor clock cycle and asserted for at least one processor clock cycle. These signals may be
deasserted and asserted individually.

If the interrupt signals are asynchronous to CLK2, the 80960KB processor internally synchronizes
them. For the CPU to recognize the asynchronous interrupt input signals, they must be preset by
deasserting them for at least two processor clock cycles, and then asserting them for at least two
processor clock cycles.


These signals may be deasserted and asserted individually.

The 80960 interrupt controller intelligently manages interrupts. Once an interrupt is signalled,
the 80960KB interrupt mechanism transfers control to a microcode interrupt routine. This
80960KB routine automatically allocates a new set of local registers onto the stack, posts pending
interrupts, checks priorities, and suspends or aborts long instructions before executing the user's
interrupt handler. Once the interrupt handler has completed, the return instruction "knows" it is a
return from interrupt, and the 80960KB return routine restores the local registers and the arithmetic
and process control registers, checks for pending interrupts, and returns to the next instruction of
the interrupted code.

There are two main stages the 80960KB goes through before it executes the interrupt handler:
hardware recognizes the interrupt, and then a microcode interrupt routine executes. First, the
interrupt pin is asserted. Hardware stores this event in a four-bit register, with one bit assigned to
each pin. This register is used to capture subsequent interrupts once one interrupt has been recognized.

Interrupts are recognized at instruction boundaries or at interruptible points in long instructions
(floating point). They are then immediately disabled. However, it is important to note that disabling
interrupts does not disable the four-bit register. Interrupts are saved in this register until microcode
reaches a point where it can check the register again. When the register is read, it is subsequently
cleared. The highest-priority bit in the four-bit vector that was read is cleared, which indicates that
the interrupt vector associated with it will be used. Then this vector is written back to the register
by ORing it with the register, thus maintaining any new interrupts that may have been signalled.


Next, the 80960KB recognizes that an interrupt occurred by the fact that an interrupt event has been
stored in the four-bit register. At this point the interrupt microcode routine is called by a hardware
mechanism in the interrupt controller. The interrupt routine executes the actions described by the
interrupt flow in Flow Chart 1. After the interrupt routine has completed, it "calls" the interrupt
handler and commences executing instructions. The interrupt handler is user supplied. All the
housekeeping needed to get into and out of the interrupt handler is completed by the 80960KB
microcode interrupt routine before the interrupt handler is "called". No processor housekeeping
activities need to be done by the user's interrupt handler.


The 80960KB has only one "return" instruction for all types of returns. There are three bits in the
"previous frame pointer" (local register 0) called the return status bits. See Section 4 of the
Programmer's Reference Chapter of this manual. These bits encode the type of call
and, therefore, the type of return that is to occur. The 80960KB manages this completely.

The flow diagrams show an interrupt flow, a pending interrupt flow, and an interrupt return flow. Each
of these is implemented as a microcode routine in the hardware of the 80960KB.

Pending interrupts are checked in certain situations. If a pending interrupt exists, then the "pending
interrupt" flow is executed. The four situations in which pending interrupts are checked are as follows:

Return from interrupt

-OR-

MODPC instruction (if process priority is lowered)

-OR-

The 80960KB interrupt controller manages the interrupt mechanism automatically, and therefore
there are many cases it deals with. Depending on the situation, latency may vary. The 80960KB's
interrupt latencies consist of a base latency and special-case latencies added to it. These special
cases consist of such things as using an 8259A interrupt controller, the local register cache being full,
or an interrupt occurring while the processor is already in the interrupted state.

The base interrupt latency is 85 cycles as shown in Table 5. Table 6 describes the breakdown of the
base interrupt latency. Notice that it takes only 6 cycles for the 80960KB to respond to the interrupt:
four cycles for hardware recognition of the interrupt and a minimum of one cycle to respond if the
interrupt occurs on an instruction boundary. The table indicates two cycles and assumes the interrupt
is signalled at the beginning of a RISC instruction. This value will differ depending on the instruction
being interrupted and the point at which the interrupt is signalled in the instruction. Table 7 gives
values for integer execution, floating point, and transcendental floating point instruction interrupt
boundaries.


Type of Latency                                              Cycles

Base Interrupt Latency                                           85
Return                                                           78
Interrupt immediately followed by another interrupt.
  Second interrupt posted to interrupt table.                   157
Return with a Pending Interrupt Posted                          157
Pending Interrupt                                                 0


Table 6. Constituent Parts of the Base Interrupt Latency.
(The total base interrupt latency is 85 cycles or 4.25 µs.)

Constituent Latencies                                        Cycles

Hardware Recognition                                              4
Stop Current Instruction Flow Assuming a RISC Instruction         2
Determine Next IP and Save                                        8
Read Interrupt Vector Number                                     18
Check Interrupt Priority                                          8
Read Interrupt Table Vector                                      14
Check if Processor Already Interrupted                            6
Save Process Control and Write Interrupt Record                  10
Compute Interrupt Record Address of New Local Register Set       10
Allocate New Local Register Set                                   3
Fetch New Instruction and Start Decoding                          2

Other situations that add to the latency are interrupts signalled at the start of a multicycle instruction
or multiple interrupts signalled at the same time. The first may cause a resumption record to be stored
on the stack. This records all the necessary information the 80960KB needs to resume executing the
interrupted instruction. Not all interruptible instructions cause a resumption record to be created.
If an instruction has been executing for more than approximately 520 cycles, then a resumption record
will be created. If it has been executing for less than that, the instruction is simply restarted upon
return from the interrupt. This was an engineering trade-off between the overhead of saving state after
less than 520 cycles and restarting the instruction. Restarting the instruction requires fewer cycles
for most cases.

Special Case Latencies                       Cycles

8259A Interrupt Expansion (4 ws)                 18
Frame Cache Full                                 24
Current Process in "Interrupt"                   14
RISC Instructions (Worst Case)                  3-4
Integer Execution                             10-40
Floating Point                                12-96
Transcendental Floating Point                    90
Instruction Cache Miss (2 Wait State)             5


Multiple interrupts signalled at various times are handled on a first-come, first-served basis. Interrupts
occurring at the same time are handled on a priority scheme with INT3 < INT2 < INT1 < INT0. The first
interrupt is handled as soon as the 80960KB reaches an interruptible state (e.g., end of instruction),
and subsequent interrupts are read from the Interrupt Control register and posted in the interrupt table
as soon as the microcode routine re-enables interrupts. While interrupts are not enabled, the event
(another interrupt) is stored in the four-bit register described earlier. Posting a pending interrupt to
the interrupt table adds about 60 cycles to the interrupt latency. This consists of comparing the
priorities of the processor and interrupt, writing a "one" to the appropriate bit in the pending
priorities field if the interrupt priority is less than or equal to the current priority, and writing a
"one" to the appropriate bit in the pending interrupt field in the interrupt table. The positions in the
fields are pointed to by the index vector from the Interrupt Control register or an 8259A vector.
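
The posting step can be sketched as follows. This Python model rests on assumptions that are not stated in this section: the interrupt priority is taken as the vector number divided by 8, the pending priorities field is modeled as a 32-bit mask, and the pending interrupt field as a 256-bit mask. The class and function names are hypothetical; see the Programmer's Reference for the actual interrupt table format.

```python
class InterruptTable:
    """Simplified model of the pending fields in the interrupt table."""

    def __init__(self):
        self.pending_priorities = 0   # assumed: one bit per priority (0-31)
        self.pending_interrupts = 0   # assumed: one bit per vector (0-255)


def post_or_service(table, vector, current_priority):
    """Service a higher-priority interrupt now; otherwise post it."""
    priority = vector // 8            # assumed vector-to-priority mapping
    if priority > current_priority:
        return "service"
    # Less than or equal to the current priority: record it for later.
    table.pending_priorities |= 1 << priority
    table.pending_interrupts |= 1 << vector
    return "posted"


table = InterruptTable()
print(post_or_service(table, vector=0x42, current_priority=10))
print(post_or_service(table, vector=0xF0, current_priority=10))
print(hex(table.pending_priorities))
```

Keeping a separate per-priority summary lets the processor test a single word to see whether any pending interrupt now exceeds its priority, rather than scanning every vector bit.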


The minimum interrupt latency is 85 clocks or 4.25 µs at 20 MHz. This latency assumes the
interrupt handler is in the cache. If there is an instruction cache miss, five clocks for caching the
instructions must be added to the base latency (assuming a two-wait-state memory system). In most
cases the instruction will be cached already. A program's typical latency would add about 3 more
clocks for non-RISC instructions. If there is a local register cache miss, then 24 cycles or 1.2 µs
should be added. The worst-case interrupt latency would be 181 cycles or 9.05 µs. This assumes
the interrupt is signalled at the beginning of an ediv instruction (40 cycles), there is a local register
cache miss (24 cycles), the current process is in the "interrupt" state (14 cycles), and an 8259A with
4 wait states is being used (18 cycles).
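
The worst-case figure follows directly from adding the special-case latencies to the base latency. The Python sketch below reproduces that arithmetic; the names are hypothetical, and the cycle counts and the 20 MHz clock are taken from the text.

```python
BASE_LATENCY_CYCLES = 85
CLOCK_MHZ = 20.0

WORST_CASE_ADDERS = {
    "Interrupt at start of ediv instruction": 40,
    "Local register cache miss": 24,
    "Current process in interrupt state": 14,
    "8259A with 4 wait states": 18,
}


def latency_cycles(base, adders):
    """Add the selected special-case latencies to the base latency."""
    return base + sum(adders.values())


def to_microseconds(cycles, clock_mhz=CLOCK_MHZ):
    """Convert cycles to microseconds at the given processor clock."""
    return cycles / clock_mhz


worst = latency_cycles(BASE_LATENCY_CYCLES, WORST_CASE_ADDERS)
print(worst, "cycles =", to_microseconds(worst), "microseconds")
```

The same function can estimate other scenarios by passing only the adders that apply, for example a local register cache miss alone.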


It is important to note that during the microcode routine all of the stack manipulation, saving of
state, checking of priorities, and allocation of new registers are done automatically. When the
80960KB enters the user interrupt handler, this routine does not have to do any housekeeping; it can
start immediately with useful code. The benefit is that this work is done by the processor in microcode
and can be done quickly and efficiently. Also note that the 80960KB responds to an interrupt in as
little as 6 clocks. This is from the point of interrupt pin assertion to the point that the instruction
flow is stopped and the microcode routine starts the housekeeping
tasks. Normally processors do not include any of the
Table 7 gives the latencies based on special cases that may occur. These values must be added to the
base latency from Table 5.


The system RESET signal provides an orderly way to start or restart the 80960KB processor. When
the 80960KB processor detects the low-to-high transition of RESET, it terminates all external
activities and places the output pins in the high-impedance state or deasserted condition. When the
RESET signal falls low again, the 80960KB processor begins the initialization process and later
starts fetching instructions from a specific address.

To properly reset the 80960KB processor to a known state, the low-to-high transition of RESET must
be asserted relative to any rising edge of CLK2 and remain asserted for at least 41 CLK2 cycles, as
shown in Figure 27. RESET must be deasserted after the rising edge of CLK2, but prior to the next
rising edge of CLK2. This establishes the next rising edge of CLK2 as edge A.

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The RESET input signal to the 80960KB processor 
can easily be generated 
by implementing 
a 
synchronization 
circuit comprised of a two D-type flip-flops, as shown in Figure 28. 


The user RESET signal is synchronized with the CLK signal by applying CLK to the clock input of
both flip-flops. To protect against a metastable RESET signal, the output of the first flip-flop,
SYNC, is applied to the input of the second flip-flop. The output of the second flip-flop results in
a processor RESET signal. The timing diagram for these signals is shown in Figure 29. CLK or CLK2
can be used instead of CLK in Figure 29. Using CLK provides an edge A corresponding to the rising
edge of CLK.
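

The behavior of the two flip-flop synchronizer can be sketched as a cycle-by-cycle model. This is a minimal illustration, not a description of the circuit in Figure 28: it assumes both flip-flops power up cleared, treats each list entry as the user RESET level seen at one rising clock edge, and does not model metastability. The function name is hypothetical.

```python
def synchronize(user_reset_samples):
    """Return (SYNC trace, processor RESET trace), one entry per rising clock edge."""
    sync = 0    # output of the first flip-flop (SYNC)
    reset = 0   # output of the second flip-flop (processor RESET)
    sync_trace, reset_trace = [], []
    for sample in user_reset_samples:
        # Both flip-flops are clocked by the same edge: the second captures the
        # previous SYNC value while the first captures the new input level.
        sync, reset = sample, sync
        sync_trace.append(sync)
        reset_trace.append(reset)
    return sync_trace, reset_trace
```

In this model the processor RESET output follows the user RESET input with a delay of one extra clock edge, which is the time allowed for the SYNC output to settle.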


This circuit assumes an asynchronous user RESET signal. If the user RESET signal is already
synchronous with the CLK signal, the same circuitry can be implemented as shown in Figure 30.


In this case, however, the output from the first flip-flop is used to generate the processor RESET
signal rather than being routed to the input of the second flip-flop.


The initialization sequence of events is shown in Figure 31. When RESET is deasserted after a
minimum of 41 CLK2 cycles, several actions take place: two input pins are sampled, the FAILURE
output signal (see the next section for the pin description) is asserted, and the self-test is
performed.


When RESET is deasserted, the 80960KB processor samples the signals on the INT0/IAC and the BADAC
pins (see the next section for the pin description of BADAC). At this time, these pins are
interpreted as the Local Processor Number (LPN) and Startup (STARTUP) signals, respectively. The LPN
input signal defines whether the 80960KB processor is a PBM (high voltage input level) or an SBM
(low voltage input level). The STARTUP input pin indicates whether the 80960KB processor performs
initialization (high voltage level) or not (low voltage level). The STARTUP signal is used to allow
one or more processors to perform the active initialization. The input voltage levels for LPN and
STARTUP must be set up 3 ns before the rising CLK2 edge prior to edge A and held 10 ns beyond edge C,
as shown in Figure 32.


Besides sampling the two input pins, the 80960KB processor asserts the FAILURE output signal a
few cycles after RESET is deasserted. The FAILURE signal remains asserted while the CPU performs
the self-test. If a failure is detected during the self-test, FAILURE remains asserted and the CPU
enters the stopped state, in which the processor does nothing. If the self-test completes
successfully, the CPU deasserts the FAILURE signal.


An 80960KB processor that is designated as the initialization processor proceeds by doing a
checksum test of eight words fetched from memory at physical address 0000 0000H to ensure that
the memory and L-bus are operating properly. If the initial checksum is incorrect, then the FAILURE
signal is asserted (and remains asserted) and the 80960KB processor enters the stopped state. After
a successful checksum test, the 80960KB processor uses some of the words as addresses to initial
data structures. Complete details are provided in the Programmer's Reference chapter.


Just prior to executing the first instruction, the 80960KB processor clears any latched interrupt
signals.


The 80960KB processor provides an input signal (BADAC) for notification of an error in the system,
and provides an output signal (FAILURE) for notification of an error within the processor.


When asserted, the Bad Access input signal indicates that an unrecoverable error occurred during
the current data transfer. If, however, BADAC was asserted after a Synchronous Move or Synchronous
Load instruction, the error is recoverable. The 80960KB processor samples the BADAC input signal
during the cycle following the one when the last READY is asserted.


The FAILURE signal indicates that an error occurred during initialization. The 80960KB processor
always asserts FAILURE after the activation of the RESET signal. If a failure is detected during a
self-test, FAILURE remains asserted. Otherwise, the processor deasserts FAILURE after a successful
self-test is performed. If the initial memory checksum is incorrect, the initialization processor
asserts FAILURE a second time, and keeps it asserted. FAILURE is an open drain output signal.


The L-bus is a high speed 32-bit multiplexed bus with burst-transfer capability and is designed to
operate with the high performance 80960KB processor. The L-bus consists of two signal groups:
address/data, and control. These signal groups are utilized by the 80960KB processor to perform
read, write, and burst transactions.


The arbitration, interrupt, and reset operations are related to the L-bus transactions. The
arbitration operation transfers control of the L-bus to another bus master. Three methods are
available to handle interrupts: by invoking the on-chip interrupt controller, by employing an
external interrupt controller using the INTR/INTA signals, or by using an IAC message. The reset
function sets the 80960KB processor to a known internal state after it successfully completes the
self-test. These operations offer power and flexibility to hardware system design using the 80960KB
processor.


The high-speed bus interface has many features that enhance high-performance designs. In
particular, the burst-transfer feature allows up to four successive 32-bit data word transfers at a
maximum rate of one word every processor clock cycle. This section outlines approaches for memory
designs that use these features, describes memory design considerations, analyzes the timing, and
lists a number of useful examples. The concepts illustrated by these examples apply to a wide variety
of memory system implementations.


Figure 33 shows the major logic blocks of the memory interface circuit. The data transceivers buffer
the data to compensate for any slow devices that may be connected to the 80960KB processor. The
address latches demultiplex the address/data signals from the 80960KB processor and latch the
address. The address decoder selects the appropriate memory device from the latched address. To
accommodate a memory burst transaction, the burst logic decrements the word count, increments
the local address lines 3 and 2 (LAD3 and LAD2), and generates a CYCLE-IN-PROGRESS signal.


The timing control generates a READY signal and other specific signals required by a particular 
memory device. The byte enable latch stores the byte enable signals. 


Although not part of the basic memory interface, the DRAM controller, SRAM interface, DRAM,
SRAM, and EPROM are included in Figure 33 for completeness. In a hardware system the DRAM,
SRAM, and EPROM are typically located in separate subsystems.


Although the memory interface circuit can be designed using programmable logic, gate arrays, or
other custom logic, the examples use standard components wherever possible to illustrate the design
concepts.


Standard 8-bit transceivers can be used to provide isolation and additional drive capability for the
L-bus. Transceivers can be used to prevent bus contention that can occur if some memories are slow
to remove data from the L-bus after a read operation. For example, if a write operation follows a read
operation, the 80960KB processor may drive the L-bus before a slow device has removed its output
data, potentially causing a current spike on the power and ground lines. Transceivers, however, can
be omitted if the data float time of the device is short enough and the load does not exceed the
80960KB device specifications.


The data transceivers can be controlled by two signals from the 80960KB processor: data transmit/ 
receive (DT/R) and data enable (DEN). DT/R indicates the direction of data flow and DEN enables 
the transceivers. 


Conventional transparent latches can be used to demultiplex the address/data lines of the 80960KB
processor and to hold the address constant during the memory operation. The latch is controlled by
the ALE signal from the 80960KB processor. ALE passes through an inverter, so that when ALE goes
low, the address flows through the latch. The low-to-high transition of ALE can be used to latch the
address. The output enable of the latch can be tied to ground. The lower four address lines
(LAD3-LAD0) are latched by the burst logic.


The 80960KB processor accesses both memory and I/O devices by supplying a 32-bit address and
a read/write command. The address decoder determines which particular memory or I/O device is
selected by decoding the address lines. The following discussion focuses on memory selection, and
the "Address Decoder" portion of Section 5 discusses I/O device selection using memory-mapped
I/O techniques.


The memory address can be divided into regions where one region can apply to EPROM or ROM,
another to RAM, and another to the I/O registers. In an 80960KB-based system the ROM address
space is likely to start at address 0000 0000H because the CPU begins execution at this address. The
RAM or I/O regions can start at any other address in the 4G-byte address range except for addresses
FF000000H through FFFFFFFFH, which the 80960KB processor reserves for inter-agent
communication.


Because of the large address range of the 80960KB processor, the address can be divided into word 
address bits and chip select bits. Typically the higher-order address bits are decoded to generate the 
selection signal for ROM, RAM, or I/O devices. 


The address decoder can be located either before or after the address latches. Usually, it is placed
after the latches, so that the chip-select signal does not need to be latched. Figure 33 shows the
block diagram of the address decoder placed behind the address latches.
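

The decoding of higher-order address bits into chip-select signals can be sketched as follows. This is a minimal illustration under stated assumptions: only the EPROM region at address 0000 0000H and the reserved range FF000000H through FFFFFFFFH come from the text; the other region assignments, the choice of address bits 31-28 as the decoded field, and the IO-CS name are hypothetical.

```python
IAC_BASE = 0xFF000000   # FF000000H-FFFFFFFFH is reserved (from the text)

# Hypothetical memory map keyed on address bits 31-28.
REGION_SELECTS = {
    0x0: "EPROM-CS",    # ROM region starts at 0000 0000H
    0x1: "SRAM-CS",     # assumed placement
    0x2: "DRAM-CS",     # assumed placement
    0x3: "IO-CS",       # assumed placement, hypothetical signal name
}


def decode(address):
    """Return the chip-select name for a latched 32-bit address, or None."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    if address >= IAC_BASE:
        return None                         # reserved range, no device selected
    return REGION_SELECTS.get(address >> 28)   # None if the region is unmapped
```

Only the upper bits take part in the selection; the remaining bits pass to the selected device as the word address.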


To enhance system performance, the 80960KB processor performs burst transactions that transfer
up to four data words at a maximum rate of one word every clock cycle. A DRAM controller can
be designed that takes advantage of the burst-transfer capability by using the static column mode or
nibble mode features of the DRAM (see the "DRAM Controller" in this section). This DRAM
controller requires a signal, called CYCLE-IN-PROGRESS, to identify the start and end of a memory
cycle. The burst logic generates the CYCLE-IN-PROGRESS signal.


Figure 34 shows the flow chart for the burst logic. If ADS is low and DEN is high, then the burst
logic latches LAD3 through LAD0, and asserts the CYCLE-IN-PROGRESS signal. The burst logic checks
the SIZE signals (LAD1 and LAD0). If the value of the SIZE signals equals zero, then the burst logic
runs one memory cycle and terminates the CYCLE-IN-PROGRESS signal. If the value of the SIZE
signals does not equal zero, the burst logic runs one memory cycle, increments the lower two latched
address bits (A2 and A3), and decrements the SIZE value. When this is finished, the burst logic checks
the value of the SIZE signals again.
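

The flow described above can be sketched as a short loop. This is an illustrative model only: it takes the four latched low-order lines as a 4-bit integer, treats SIZE as the number of words minus one (consistent with "SIZE equals zero runs one memory cycle"), and returns the word-select value (A3:A2) used in each memory cycle. The function name is hypothetical.

```python
def burst_cycles(lad_low_nibble):
    """Return the A3:A2 word select used in each memory cycle of one access."""
    word = (lad_low_nibble >> 2) & 0x3   # latched LAD3 and LAD2
    size = lad_low_nibble & 0x3          # SIZE signals on LAD1 and LAD0
    cycles = []
    while True:                          # CYCLE-IN-PROGRESS is asserted
        cycles.append(word)              # run one memory cycle
        if size == 0:
            break                        # terminate CYCLE-IN-PROGRESS
        word = (word + 1) & 0x3          # increment the latched A3:A2 bits
        size -= 1                        # decrement the SIZE value
    return cycles
```

The mask on the increment is a safeguard of this model; a valid access never wraps, because a burst does not cross a 16-byte boundary.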


The burst logic can be used with EPROM, SRAM, and DRAM memories. However, it cannot be used
in the DRAM static column or nibble modes, because they do not support burst transactions. Because
the 80960KB processor ensures that a burst transaction cannot exceed four words or cross a 16-byte
boundary, incrementing LAD3 and LAD2 after a single data word transfer makes the burst transfer
transparent to the memory devices.


The timing control logic accommodates memory devices that cannot transfer information at the
maximum bus rate by inserting wait states until the data becomes available. The timing control logic
consists of a counter and timing logic, as shown in Figure 35. The counter produces a 4-bit binary
count. The count begins when the CYCLE-IN-PROGRESS signal is asserted. The timing logic
asserts READY at the appropriate time based upon the count, the EPROM-CS, and the SRAM-CS
signals. For a burst transfer, READY resets the counter to properly time a READY signal for the next
data transfer. When CYCLE-IN-PROGRESS is deasserted, the clock counting is terminated.
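

The counter-based wait-state generation can be sketched as follows. This is a simplified model, not the circuit of Figure 35: the wait-state counts per chip select are hypothetical values chosen for illustration, each list entry represents one data-phase clock, and DRAM (which supplies its own DRAM-RDY) is not modeled.

```python
WAIT_STATES = {"EPROM-CS": 3, "SRAM-CS": 0}   # hypothetical wait-state counts


def ready_trace(chip_select, words):
    """Return the READY level (1 or 0) for each data-phase clock of one access."""
    waits = WAIT_STATES[chip_select]
    trace = []
    count = 0                    # counter starts when CYCLE-IN-PROGRESS asserts
    remaining = words
    while remaining:             # CYCLE-IN-PROGRESS stays asserted
        if count == waits:
            trace.append(1)      # timing logic asserts READY
            count = 0            # READY resets the counter for the next word
            remaining -= 1
        else:
            trace.append(0)      # wait state
            count += 1
    return trace
```

With the assumed values, a two-word SRAM burst produces READY on every clock, while a two-word EPROM burst inserts three wait states before each READY.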


Because the timing of DRAM is more complicated, the DRAM controller generates a DRAM-RDY
signal to the timing control logic. In addition, the clock count, the W/R command, and the SRAM-CS
signal can also be used to generate SRAM-WE and SRAM-OE signals.


The byte enable latch holds the byte enable signals constant until the DRAM controller or SRAM
interface uses the signals. As mentioned in the "L-Bus Signal Groups" section in Section 3, the byte
enable signals specify which bytes (up to four) on the 32-bit data bus are transferred during the data
cycle. Each individual byte enable signal selects eight data lines as shown in Table 5.


Byte Enable Signal    Address Line Selection

BE0


The byte enable signals are valid from the 80960KB processor before data is transferred. These
signals are asserted during the address cycle for the first data word transfer; they are asserted again
during the first data cycle for the second word transfer; the second data cycle for the third word
transfer; and the third data cycle for the fourth word transfer. For each word, the byte enable signals
remain valid throughout every data or wait cycle until READY is asserted. After READY is asserted,
the byte enable signals change during the next processor clock cycle.


The ALE signal can be used to latch the first byte enable signals. READY can be used to latch the
other byte enable signals for each word.


The basic memory interface can be used in conjunction with the SRAM interface to read and write
to SRAM. This section describes the SRAM interface and examines the timing.


The SRAM interface logic uses the latched byte enable signals, the SRAM-OE, and the SRAM-WE
signals to generate four output enable signals (SRAM-OE3 through SRAM-OE0) and four write
enable signals (SRAM-WE3 through SRAM-WE0), as shown in Figure 36. These signals allow the
80960KB processor to write to the data byte that is specified by the byte enable signals. SRAMs with
separate OE and CS signals require only one OE signal per bank since the 80960KB ignores
unrequested bytes in read operations.
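

The gating of the common SRAM-WE and SRAM-OE signals by the latched byte enables can be sketched as follows. This is an illustration under stated assumptions: all signals are modeled as active-high booleans for readability (the polarity of the real signals is not modeled), and the byte enables are passed as a 4-bit mask in which bit n represents BEn. The function name is hypothetical.

```python
def sram_strobes(byte_enables, sram_we, sram_oe):
    """Return ([WE0..WE3], [OE0..OE3]) for one data cycle.

    byte_enables: 4-bit mask, bit n set means byte enable n is asserted.
    sram_we, sram_oe: common write-enable and output-enable strobes.
    """
    selected = [bool((byte_enables >> n) & 1) for n in range(4)]
    we = [bool(sram_we) and sel for sel in selected]   # per-byte write enables
    oe = [bool(sram_oe) and sel for sel in selected]   # per-byte output enables
    return we, oe
```

For example, a write with only the two low-order byte enables asserted yields write strobes for bytes 0 and 1 only, so the other two bytes of the word are left unchanged.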


This section analyzes the critical timing paths of the SRAM control signals. From the critical path, 
the timing equations can be derived to determine the memory access time for no wait state operation. 


When evaluating critical timing paths, the timing calculations should use worst-case data sheet
parametric specifications, rather than typical specifications. By using worst-case timing values,
reliable operation is assured over all variations in temperature, voltage, and individual device
characteristics. These timing values are determined by assuming the maximum propagation delay to
latch an address, select a memory device, and pass through data buffers and transceivers.


Figure 37 shows the critical timing path for a one-word SRAM read operation. The diagram consists
of three time periods: the address setup period (Taddrset), the memory response period (Tmem), and
the data return period (Tdataset). Note that the timing for the read command and output control
signals does not enter into the critical timing path.


During the Taddrset period, the 80960KB processor outputs a valid address that is latched on the
low-to-high transition of the ALE signal. The address decoder generates the SRAM-CS signal from the
latched address and the Timing Control/SRAM Interface logic subsequently generates the OE
signals. During the Tmem period the SRAM responds to the commands and signals and retrieves the
data. The access time of the memory determines the duration of the Tmem period. Tmem can be varied
in increments of clock cycles by delaying the READY signal.


The data must be available at the address/data pins of the CPU before the end of the data state. The
Tdataset period must take into account the setup time requirement of the 80960KB processor and the
throughput delay of a data transceiver.


For a no wait state operation, the data word transfer must be completed in two system clock (CLK)
cycles. The minimum time period for a no wait state operation (Tmem-no-wait) can be determined by
using equation 1.


Tmem-no-wait = 2CLK - Taddrset - Tdataset                    (1)

where: Tmem-no-wait = Memory access time for no wait state operation

       2CLK         = Two system clock (CLK) cycles

       Taddrset     = Maximum delay to valid address
                    + Maximum throughput delay of address latch
                    + Maximum delay to generate chip select
                    + Maximum delay to generate SRAM-OEn

       Tdataset     = Maximum delay through data transceiver
                    + Maximum data setup time of CPU

A similar analysis can be done for burst transactions. Equation 1 can be used to determine the access
time for no wait state operation of the first word. For subsequent words, equation 2 can be used. In
this equation, the address setup time is replaced by the delay in the burst logic to change the address
(Tburst). In this case, the data transfer of each subsequent word must be completed in one system
clock (CLK) cycle (no address state). The minimum access time for a no wait state operation
(Tmem-no-wait) can be determined by using the lesser value of equation 1 or equation 2.


Tmem-no-wait = CLK - Tburst - Tdataset                       (2)

where: Tmem-no-wait = Memory access time for no wait state operation

       CLK          = One system clock (CLK) cycle

       Tburst       = Maximum delay to change the address

       Tdataset     = Maximum delay through data transceiver
                    + Maximum data setup time of CPU
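

The two read-timing budgets can be evaluated numerically as a quick check on a candidate SRAM. The sketch below is illustrative: the form used for equation 2 (one CLK cycle less the burst-logic delay and the data return period) is read from the description above, and the delay values in the usage note are hypothetical, not data sheet figures. All times are in nanoseconds.

```python
def t_mem_first_word(clk_ns, t_addrset_ns, t_dataset_ns):
    """Equation 1: access-time budget for the first word (two CLK cycles)."""
    return 2 * clk_ns - t_addrset_ns - t_dataset_ns


def t_mem_burst_word(clk_ns, t_burst_ns, t_dataset_ns):
    """Equation 2 as read from the text: budget for each subsequent word."""
    return clk_ns - t_burst_ns - t_dataset_ns


def required_access_time(clk_ns, t_addrset_ns, t_burst_ns, t_dataset_ns):
    """The memory must meet the lesser (tighter) of the two budgets."""
    return min(
        t_mem_first_word(clk_ns, t_addrset_ns, t_dataset_ns),
        t_mem_burst_word(clk_ns, t_burst_ns, t_dataset_ns),
    )
```

With a hypothetical 50 ns CLK, a 45 ns address setup period, a 10 ns burst-logic delay, and a 15 ns data return period, the first word allows 40 ns and each subsequent word allows 25 ns, so a memory slower than 25 ns would need wait states on burst words.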


The timing analysis described for an SRAM read operation can be used for EPROM timings. If
EPROMs are only used to store initialization programs, they are seldom accessed compared to
memory devices used to store program data or instructions. Consequently, the addition of wait states
during the read cycle does not affect overall system performance.


Figure 38 shows the critical timing path for an SRAM write operation. The diagram consists of two
time periods: the address setup period (Taddrset) and the memory response period (Tmem).


During the Taddrset period, the 80960KB processor outputs a valid address that is latched on the
low-to-high transition of ALE. The address decoder generates the SRAM-CS signal from the latched
address.


During the Tmem period the SRAM responds to the commands and writes the data. The access time
of the memory determines the duration of the Tmem period. Tmem can be varied in increments of clock
cycles by delaying the READY signal.


Two timing paths should be considered during the Tmem period: the path where data is supplied to
the memory, and the path that monitors the memory write cycle time. The first path takes into account
the time for the 80960KB processor to generate valid data, the throughput delay of a data transceiver,
and the data setup time requirement of the memory. The second path is the memory write cycle
specification. The longer of the two paths is the critical timing path.


By examining the timing path required to operate the SRAM, equation 3 can be derived, which
determines the SRAM write cycle time for no wait state operation. The memory cycle time is
determined by the lesser value of equation 1 or equation 2.


Tmem-no-wait = 2CLK - Taddrset                               (3)

where: Tmem-no-wait >= Maximum delay to valid data
                     + Maximum throughput delay of data transceiver
                     + Maximum data setup time of memory

       2CLK         = Two system clock (CLK) cycles

       Taddrset     = Maximum delay to valid address
                    + Maximum throughput delay of address latch
                    + Maximum delay to generate chip select


This section provides design guidelines for a DRAM controller. DRAMs offer static column mode
and CAS before RAS refresh features. This section shows guidelines on how to use these features
with the burst capability of the 80960KB processor to significantly enhance system throughput.


The DRAM controller multiplexes the address into a row and column address, performs the refresh
operation, arbitrates between a refresh request and memory request, and generates the necessary
control signals for the DRAM. To implement these functions, the memory controller uses an address
multiplexer, arbiter, refresh interval timer, and DRAM timing and control as shown in Figure 39.


The address multiplexer divides the DRAM address into a row and column address. The proper
selection of a row or column address is accomplished by the row/column select signal (ROW/COL)
from the DRAM timing and control circuit.


The refresh interval timer periodically generates a refresh request (REF-REQ) by counting enough 
bus cycles to equal the refresh interval period. Since a refresh request is processed after a completed 
operation, the refresh period must take into account the time required to perform a bus operation, as 
well as the DRAM refresh specification. 
For example, a 1M-bit DRAM that requires 512 refresh 
cycles within 8 ms needs a refresh cycle every 15.6 us. To meet the DRAM specification, 
the refresh 
interval timer must generate a refresh request in less than 15.6 us to compensate for any required time 
to complete the operation with wait states. 
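

The refresh interval arithmetic in the example above can be sketched as follows. The 512-cycle, 8 ms refresh requirement comes from the text; the 50 ns clock period and the 20-clock allowance for the longest bus operation in the usage note are hypothetical values chosen only to show how the timer count is reduced.

```python
def refresh_interval_us(refresh_cycles, refresh_period_ms):
    """Average time allowed between refresh cycles, in microseconds."""
    return refresh_period_ms * 1000.0 / refresh_cycles


def timer_count(refresh_cycles, refresh_period_ms, clk_ns, longest_bus_op_clks):
    """Clock count for the interval timer, leaving margin for a pending operation."""
    interval_ns = refresh_period_ms * 1_000_000 / refresh_cycles
    clocks_per_interval = int(interval_ns // clk_ns)    # round down, never late
    return clocks_per_interval - longest_bus_op_clks
```

For the 1M-bit DRAM in the example, refresh_interval_us(512, 8) gives 15.625 us (quoted as 15.6 us). With the assumed 50 ns clock that is 312 whole clocks, and subtracting the assumed 20-clock allowance leaves a timer count of 292.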


After the REF-REQ signal is generated, the arbiter sends a refresh acknowledge signal (REF-ACK)
back to the interval timer to assure that refresh occurred before generating another REF-REQ.


The DRAM controller uses an arbiter to decide whether a memory cycle or refresh cycle is performed.
In a synchronous design, arbitration is easily performed because memory and refresh cycle requests
never occur at or near the same time.


The arbiter monitors memory cycle requests and refresh requests. The arbiter detects a DRAM
memory request by decoding two signals: DRAM-CS and CYCLE-IN-PROGRESS. The REF-REQ
signal indicates that a refresh cycle must be performed. The arbiter arbitrates between a memory
cycle and a refresh cycle and generates a Memory/Refresh (MEM/REF) signal. The DRAM timing and
control block uses the MEM/REF signal to start the generation of the control signals.


When a refresh cycle is performed, the arbiter sends a REF-ACK signal to the refresh timer, which
uses this signal to begin another count.


The DRAM timing and control circuit is the final logic block and core of the DRAM controller. The
functions of this circuit include the following:


Generating the DRAM control signals (RAS, CAS, and WE) with the proper timing relationships
during system operation


Generating the DRAM-RDY signal


Performing the refresh function by asserting CAS before RAS


Performing several warm-up cycles required by the DRAM when power is first applied.


The DRAM timing and control logic can be designed to take advantage of the burst-transfer
capability of the 80960KB processor by implementing static column mode or nibble mode. With
nibble mode, a multiplexed address is applied to the DRAM, and up to four bits of data are quickly
transferred by successively toggling the CAS pulse. The DRAM timing and control logic can be
designed to provide the successive CAS pulses by using the CYCLE-IN-PROGRESS and DRAM-RDY
signals.


Static column mode can also be used to take advantage of the burst capability of the DRAM. Static
column mode allows fast access to the bits located in the selected row of the DRAM simply by
changing the column address after the first access.


Figure 40 shows a flow chart for the DRAM timing and control logic using static column mode. The
DRAM timing and control circuit receives a refresh request or a memory request on the MEM/REF
and CYCLE-IN-PROGRESS input signals. For a memory request, the DRAM timing and control logic
determines whether a read or a write operation is desired from the W/R signal from the 80960KB
processor.


For a read operation, the DRAM timing and control logic performs the following functions on the
first word: it brings ROW/COL high to select a row address; it asserts RAS0; it brings ROW/COL
low to select the column address; it asserts CAS3 through CAS0 (derived from the four latched byte
enable signals); and it generates a DRAM-RDY signal. The DRAM-RDY signal causes the burst
logic to increment the address and informs the 80960KB processor, by asserting READY, that the
data word is available.


After completing these functions the DRAM timing and control logic samples the
CYCLE-IN-PROGRESS signal to determine whether to transfer another data word. If so, the DRAM
timing and control logic maintains the ROW/COL signal low to select the new column address,
deasserts and asserts CAS3 through CAS0 to observe the CAS precharge specification of the DRAM,
and generates another DRAM-RDY. The DRAM timing and control logic repeats the procedure until all
the data words are transferred. Then the DRAM timing and control logic deasserts RAS0.


For a write operation, the DRAM timing and control logic performs similar functions on the first
word: it asserts WE; it brings ROW/COL high to select a row address; it asserts RAS0; it brings
ROW/COL low to select the column address; it asserts CAS3 through CAS0 (derived from the four
latched byte enable signals); and it generates a DRAM-RDY signal. The DRAM-RDY signal causes
the burst logic to increment the address and informs the 80960KB processor, by asserting READY,
that the data word was written.


After completing these functions the DRAM timing and control logic samples the
CYCLE-IN-PROGRESS signal to determine whether the 80960KB wants to transfer another data word.
If so, the DRAM timing and control logic maintains the ROW/COL signal low to select the new column
address, deasserts and asserts CAS3 through CAS0 to observe the CAS precharge specification of the
DRAM, and generates another DRAM-RDY. The DRAM timing and control logic repeats the
procedure until all the data words are transferred. Then the DRAM timing and control logic deasserts
RAS0.
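

The write sequence in static column mode can be sketched as an ordered list of control actions. This is an illustration only: the step order follows the write branch of the Figure 40 flow chart, the step strings are descriptive labels rather than signal-level timing, and the function name is hypothetical.

```python
def write_burst_steps(words):
    """Return the ordered control actions for a static column write of `words` words."""
    if not 1 <= words <= 4:
        raise ValueError("a burst transfers one to four words")
    steps = [
        "generate row address",
        "assert RAS0 and WE",
        "generate column address",
        "assert CAS3-CAS0",
        "assert DRAM-RDY",
    ]
    for _ in range(words - 1):           # CYCLE-IN-PROGRESS still asserted
        steps += [
            "deassert CAS3-CAS0",        # observe the CAS precharge time
            "change column address",
            "assert CAS3-CAS0",
            "assert DRAM-RDY",
        ]
    steps.append("deassert RAS0")        # all data words transferred
    return steps
```

RAS0 stays asserted for the whole burst, so only the column address and the CAS pulses change between words; one DRAM-RDY is generated per word.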


Although only one RAS signal is required, four CAS signals (CAS3-CAS0) are generated to enable
each byte of the L-bus. These CAS signals are generated by the byte enable decoder and correspond
to the byte enable signals of the 80960KB processor. For example, CAS0, which is mapped directly
from BE0, selects the least-significant data byte (LAD7-LAD0).


Figure 40. Flow chart for the DRAM timing and control logic using static column mode (blocks
recovered from the figure; branch labels inferred from the steps in each block)

Start: Sample MEM/REF and CYCLE-IN-PROGRESS inputs.

Read, first word: 1. Generate row address. 2. Assert RAS0. 3. Generate column address.
4. Assert CAS3-CAS0. 5. Assert DRAM-RDY.

Read, subsequent words: 1. Change column address. 2. Assert DRAM-RDY.

Write, first word: 1. Generate row address. 2. Assert RAS0 and WE. 3. Generate column address.
4. Assert CAS3-CAS0. 5. Assert DRAM-RDY.

Write, subsequent words: 1. Deassert CAS3-CAS0. 2. Change column address. 3. Assert CAS3-CAS0.
4. Assert DRAM-RDY.

Refresh: Assert CAS3-CAS0 before RAS0.


A single WE control signal and four CAS signals ensure that only those DRAM bytes selected for
a write cycle are enabled. All other data bytes maintain their outputs in the high-impedance state.


A common design error is to use a single CAS control signal and four WE control signals, using the
WE signals to write the DRAM bytes selectively in write cycles that use fewer than 32 bits. Although
the selected bytes are written correctly, the unselected bytes are enabled for a read cycle. These bytes
output their data to the unselected bytes of the data bus while the data transceivers output data to every
bit of the data bus. When the two devices simultaneously output data to the same bus, bus contention
occurs.


The refresh function can be performed by asserting the CAS signal before asserting RAS. The CAS
before RAS refresh feature eliminates the need for an external refresh address counter. When the
CAS pulse is activated prior to the assertion of the RAS pulse, the DRAM automatically performs
a refresh cycle on one row by employing an on-chip address counter. Upon completion of the refresh
cycle, the address counter is automatically incremented. The MEM/REF signal from the arbiter can
be used by the DRAM timing and control logic block to initiate a CAS before RAS refresh cycle.


Besides generating the RAS, CAS, and WE signals, the DRAM timing and control logic generates
a number of warm-up cycles for the DRAM after reset by issuing several refresh requests.


Figure 41 shows a typical example of a timing diagram for a two-word read transaction that uses
static column mode; similarly, Figure 42 is a typical example for a two-word write transaction. The
example assumes a memory access time that requires two wait states (Tw) for the initial data word
and one wait state for the second data word.


The critical timing areas for both read and write transactions are noted by circled numbers in the
diagrams, which are enumerated below.


1. The delay for the CPU to generate a valid address.


2. The delay for the DRAM timing and control logic to generate the CYCLE-IN-PROGRESS signal.


3. The delay to generate the DRAM row address. This time includes the address latch throughput
delay, the multiplexer throughput delay, and the address driver delay.


4. The delay to generate RAS, which includes the delay to generate the DRAM-CS signal.


5. The row address hold time after the high-to-low transition of RAS.


6. The time required to generate the multiplexer control signal (ROW/COL) after the row address
hold time is satisfied.


7. The time required to switch from a row to column address plus any driver delays.


8. The delay to generate and drive the CAS signals.


9. For a read transaction, the throughput delay of the data transceivers. For a write transaction, the
delay by the CPU to generate valid data.


10. For a read transaction, the data setup time of the CPU. For a write transaction, the throughput
delay of the data transceivers.


11. The time required to increment and drive the column address.


12. For a write transaction only, the delay time to bring CAS high (terminate the CAS pulse for the
first data byte), to precharge the CAS pulse (required by the DRAM), and to assert CAS again.


13. The RAS precharge time, which must be satisfied before another memory cycle can begin.


Because the DRAM consists of dynamic nodes, a row precharge time is required to recharge the
nodes after every memory cycle. This time must be included in the timing evaluation, as noted by
the example. To avoid the precharge time delay of the DRAM, the memory array can be arranged so
that each subsequent memory access is most likely to be directed to a different bank. In this
configuration, wait time between accesses is not required because while one bank of DRAMs
performs the current access, another bank precharges and is ready to perform the next access
immediately.


If DRAMs are interleaved (i.e., arranged in multiple banks so that adjacent addresses are in different 
banks), the DRAM precharge time can be masked for most accesses. 
With two banks of DRAMs, 


one for even 32-bit addresses and one for odd 32-bit addresses, all sequential 32-bit accesses can be 
completed without waiting for the DRAM to precharge. 


Even when random accesses are made, two DRAM banks allow 50 percent of back- to-back accesses 
to be made without waiting for the DRAMs to precharge. 
The precharge time is also masked when 


the 80960KB processor has no bus accesses to be performed. 
During these idle bus cycles, the most 
recently accessed DRAM bank can precharge so that the next memory access to either bank can begin 
immediately. 
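The effect of two-bank interleaving can be sketched with a small model. This is only an illustration under stated assumptions: the bank is taken from address bit 2 (even versus odd 32-bit words), every access is back to back with no idle bus cycles, and `PRECHARGE_CLOCKS` is a hypothetical penalty, not a figure from any DRAM data sheet.

```python
import random

# Hypothetical stall, in clocks, when the same bank is accessed twice in a row.
PRECHARGE_CLOCKS = 2


def bank_of(address):
    """Select the bank from address bit 2: even and odd 32-bit words alternate."""
    return (address >> 2) & 1


def precharge_stalls(addresses):
    """Count clocks lost to precharge for a run of back-to-back accesses."""
    stalls = 0
    previous_bank = None
    for address in addresses:
        bank = bank_of(address)
        if bank == previous_bank:
            # The bank that just finished an access has not precharged yet.
            stalls += PRECHARGE_CLOCKS
        previous_bank = bank
    return stalls


# Sequential 32-bit words alternate banks, so precharge is always hidden.
sequential = [0x1000 + 4 * i for i in range(16)]

# A stride of 8 bytes stays in one bank, so every access after the first stalls.
same_bank = [0x1000 + 8 * i for i in range(16)]

# Random word addresses land in the other bank about half of the time.
rng = random.Random(0)
random_words = [4 * rng.randrange(1 << 20) for _ in range(10000)]
masked_ratio = sum(
    bank_of(a) != bank_of(b) for a, b in zip(random_words, random_words[1:])
) / (len(random_words) - 1)
```

In this model the sequential run never stalls, the single-bank run stalls on every access after the first, and `masked_ratio` comes out close to the 50 percent figure quoted for random accesses.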


The memory interface circuit allows the 80960KB processor to communicate with the memory
devices. The basic memory interface logic can be divided into six blocks: the data transceivers, the
address latches, the address decoder, the burst logic, the DRAM timing and control logic, and the byte
enable latch. The DRAM controller and SRAM interface complete the memory interface circuit. The
DRAM controller can be designed to take advantage of the 80960KB processor's burst capability to
enhance system performance.


The 80960KB processor supports 8-bit, 16-bit, and 32-bit I/O devices by mapping them into its
4-Gbyte memory address space. This section describes the design considerations for the interface
between the 80960KB processor and I/O components. Several examples illustrate the design
concepts.


The 80960KB processor accesses I/O devices by using a memory-mapped address. Consequently,
memory-type instructions can be used to perform input/output operations. For example, the
80960KB processor's LOAD and STORE instructions can directly support 8-bit and 16-bit data
moves to or from I/O peripherals. The instructions include those listed below.

Load Ordinal Byte (reads a byte)

Load Ordinal Short (reads 16-bit data)

Store Ordinal Byte (writes a byte)

Store Ordinal Short (writes 16-bit data)

These instructions perform the transfer on the data bits specified by the two low-order lines of the
effective address. See the 80960KB CPU Programmer's Reference Manual for complete details.
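The role of the two low-order address lines can be illustrated with a short sketch. It assumes a little-endian lane layout (lane 0 is D7 through D0, lane 3 is D31 through D24) and naturally aligned accesses; the function name `byte_lanes` is made up for this illustration.

```python
def byte_lanes(address, size_bytes):
    """Return the data-bus byte lanes used by an aligned 1-, 2-, or 4-byte access.

    Lane 0 is taken to be D7-D0 and lane 3 to be D31-D24 (assumed layout).
    """
    if size_bytes not in (1, 2, 4):
        raise ValueError("size must be 1, 2, or 4 bytes")
    if address % size_bytes:
        raise ValueError("this sketch only models aligned accesses")
    first_lane = address & 0x3  # the two low-order address lines
    return list(range(first_lane, first_lane + size_bytes))
```

Under these assumptions, a Load Ordinal Byte from an address ending in binary 11 uses only lane 3, and a Store Ordinal Short to an address ending in binary 10 uses lanes 2 and 3.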


In a typical 80960KB processor system design, a number of slave I/O devices can be controlled
through a general system interface. Other I/O devices, particularly those capable of controlling the
L-bus, can use the general system interface, but may require additional logic to isolate the bus. This
section describes the general system interface and assumes that the 80960KB processor does not
perform burst transactions to the I/O devices.


Figure 43 shows the major logic blocks of the general system interface. Standard 8-bit data
transceivers add drive capability, provide bus isolation, and prevent bus conflicts that may occur with
slow I/O components. The address latch demultiplexes the address/data lines and holds the address
stable throughout the L-bus transaction. The address decoder generates the I/O chip-select signals
from the latched address lines. The timing control block provides the READY signal to the 80960KB
processor and the I/O read and I/O write command.


This basic interface circuit is quite similar to the one used in the basic memory interface described
in section 4. For most systems the same data transceivers, address decoders, and address latches can
be used to access both memory and I/O devices. The timing control logic can be implemented to
accommodate both memory and I/O devices.


Standard 8-bit transceivers can be used to provide isolation and additional drive capability for the
L-bus. Transceivers prevent bus contention that can occur if some devices are slow to remove data
from the data bus after a read cycle. For example, if an I/O write cycle follows an I/O read cycle, the
80960KB processor may drive the L-bus before a slow device has removed its outputs from the bus,
potentially causing a current spike. Transceivers, however, can be omitted if the data float time of
the device is short enough and the load does not exceed the 80960KB device specifications.


The data transceiver can be controlled by two signals from the 80960KB processor: Data Transmit/
Receive (DT/R) and Data Enable (DEN). DT/R indicates the direction of data flow and DEN enables
the transceivers.


Standard transparent latches can be used to demultiplex the address/data lines of the 80960KB
processor. The latch is controlled by the ALE signal from the 80960KB processor. The ALE signal
passes through an inverter, such that when ALE goes low, the address flows through the latch. The
low-to-high transition of ALE can be used to latch the address.


If only slave-type peripherals are used in a system, the output enable of the latches can always remain 
active by connecting it to ground. For systems with DMA devices, the output enable can be used to 
permit the DMA device to drive a common address bus. 


The address decoder determines which particular I/O device is selected by decoding the address. The
I/O address can be any address in the 4-Gbyte address range except for the upper 16 Mbytes
(addresses FF000000H through FFFFFFFFH), which the 80960KB processor reserves for inter-agent
communication and internal I/O. Typically, a small range of addresses is reserved for accessing I/O
devices by defining certain higher-order address bits as an I/O access.

As an example, consider a 32-bit address: A31 through A15 could indicate an I/O access when A31 is
set to zero and A30 through A15 are set to one; A14 through A5 could then be used to specify a
particular I/O device; and A4 through A2 can be used to access up to 8 registers of the I/O component.
A1 and A0 are not used by the I/O device. This particular scheme selects up to 1,024 devices, while
using only 32K bytes of the available 4 Gbytes of address space.
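The example decoding scheme can be written out as a short sketch. The constant and function names are illustrative only; the bit fields follow the example above (A31 = 0 and A30 through A15 = 1 mark an I/O access, A14 through A5 select the device, A4 through A2 select the register).

```python
IO_MASK = 0xFFFF8000  # compares address lines A31 through A15
IO_BASE = 0x7FFF8000  # A31 = 0, A30 through A15 = 1


def decode_io(address):
    """Return (device, register) for an I/O address, or None for a non-I/O address."""
    if (address & IO_MASK) != IO_BASE:
        return None
    device = (address >> 5) & 0x3FF   # A14 through A5: one of 1,024 devices
    register = (address >> 2) & 0x7   # A4 through A2: one of 8 registers
    return device, register


def io_address(device, register):
    """Build the memory-mapped address of a device register (A1 and A0 are zero)."""
    return IO_BASE | (device << 5) | (register << 2)
```

With this layout the I/O window runs from 7FFF8000H to 7FFFFFFFH: 1,024 devices times 8 registers times 4 bytes is 32K bytes, and the window stays clear of the reserved upper 16 Mbytes.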


The address decoder can be located either before or after the address latches. Usually, it is placed
after the latches, so that the chip-select signal does not need an additional latch.


The timing control logic accommodates I/O devices that cannot transfer information at the maximum
bus rate by inserting Wait States until the data becomes available. The timing control logic consists
of a counter and timing logic, as shown in Figure 44. The counter produces a 4-bit binary count. The
count is started at the beginning of the operation (determined by ADS and DEN) and is stopped by
the READY signal. The timing logic asserts the READY signal, the I/O write command (I/O-WR),
and the I/O read command (I/O-RD) based upon the clock count, the I/O chip select signal (I/O-CS),
and the W/R command.


[Figure: counter and timing logic. Signals shown: START CYCLE, COUNT0 through COUNT3, READY,
I/O-RD, and I/O-WR.]


For many peripherals, the timing logic can be programmed to assert READY at the appropriate count
for the selected device. Specific I/O chip select signals can be used to indicate how many clock cycles
to wait before asserting READY.
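A behavioral sketch of this count-based READY generation follows. The per-device counts in `READY_COUNT` are hypothetical values chosen for illustration; real counts depend on the access time of each peripheral and on the clock rate.

```python
# Hypothetical clock counts at which READY is asserted for each chip select.
READY_COUNT = {"8259A-CS": 3, "82530-CS": 5}

MAX_COUNT = 15  # the counter produces a 4-bit binary count


def ready_trace(chip_select):
    """Return the READY level seen on each clock, from cycle start until READY."""
    target = READY_COUNT[chip_select]
    if target > MAX_COUNT:
        raise ValueError("count does not fit in the 4-bit counter")
    trace = []
    count = 0
    while True:
        ready = count == target  # timing logic compares the count with the target
        trace.append(ready)
        if ready:
            return trace  # READY stops the counter and ends the cycle
        count += 1
```

For the assumed count of 3, `ready_trace("8259A-CS")` holds READY inactive for three clocks and asserts it on the fourth.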


For some I/O peripherals, particularly bus masters, READY cannot be determined by counting clock
cycles. For these I/O devices, READY can be supplied by the device and passed on to the 80960KB
processor.

The timing control block can assert the I/O-RD or I/O-WR signal for I/O devices based upon the clock
count. The timing for these signals can be selected for the slowest device to simplify the logic circuit
or can be customized for each individual peripheral device to maximize performance.

The general system interface shown in Figure 43 can be used to connect the 80960KB processor to
many slave peripherals. The following list includes some common peripherals compatible with this
interface:


8259A Programmable Interrupt Controller

8253, 8254 Programmable Interval Timer

8272 Floppy Disk Controller

82062, 82064 Fixed Disk Controller

82510 Asynchronous Serial Controller

8274, 82530 Multi-Protocol Serial Controller

8255 Programmable Peripheral Interface

8041, 8042 Universal Peripheral Interface


This section provides guidelines and design considerations for interfacing the 80960KB processor
to different types of I/O configurations. Specifically, four design examples are examined. The 8259A
design example shows how to interface the 80960KB processor to a slave-type peripheral device.
The 82530 design example shows how to connect a serial communication controller through the
general system interface. The 82586 design example shows how a 16-bit bus master reads and writes
to the 80960KB processor's system memory. The 82786 design example shows how the 80960KB
processor can read or write to graphics memory using a 16-bit data bus.


The 8259A Programmable Interrupt Controller is designed for use in interrupt-driven microcomputer
systems, where it manages up to eight independent interrupts. The 8259A handles interrupt
priority resolution and returns an 8-bit vector to the 80960KB processor during an
interrupt-acknowledge cycle. Intel Application Note AP-59 contains detailed information on
configurations of the 8259A.


Figure 45 shows the connection of the 80960KB processor to a single 8259A Interrupt Controller.
This circuit consists of the general system interface plus a bidirectional buffer. The example assumes
that several interrupt requests occur at the same time so that priority resolution is required.


The data lines from the 8259A are not directly aligned to the 80960KB processor because of the
difference in priority resolution between the devices. Although both devices use an 8-bit interrupt
vector, the 80960KB processor implicitly defines the priority by dividing the interrupt vector by
eight, whereas the 8259A defines the priority in the lower three bits of the interrupt vector.
Furthermore, the highest priority vector of the 80960KB processor has a value of 31 in the upper five
bits of the interrupt vector, whereas the highest priority interrupt of the 8259A has a value of 0 in the
lower three bits of the interrupt vector.

To reconcile the two schemes, the data bits from the 8259A are inverted and rotated left by three bits,
as shown by the data alignment between the 80960KB processor and 8259A in Figure 45. Rotating
the data bits in this manner provides two advantages: the interrupt table for the 8259A can be located
by contiguous addresses, and the two most significant bits of the interrupt vector remain free to group
interrupt vectors if additional 8259As are needed.


Care must be exercised, however, when programming the registers of the 8259A. For example,
assume that the second initialization command word (ICW2 register) of the 8259A requires a data
byte value of 00011111B. To transfer the correct information, the 80960KB processor needs to write
a data word with the value of 00000111B, because the buffer inverts the data and the data lines of the
8259A are rotated left three places relative to the L-bus.
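The ICW2 example can be checked with a short sketch. It assumes that all eight data lines are inverted and that a byte travelling from the 8259A to the processor appears rotated left by three places, so a byte travelling the other way is effectively rotated right by three. The function names are illustrative, and this reading is an interpretation chosen because it reproduces the values quoted above.

```python
def rotl8(value, places):
    """Rotate an 8-bit value left."""
    return ((value << places) | (value >> (8 - places))) & 0xFF


def rotr8(value, places):
    """Rotate an 8-bit value right."""
    return ((value >> places) | (value << (8 - places))) & 0xFF


def to_processor(byte_from_8259a):
    """What the processor reads: the 8259A byte inverted and rotated left by 3."""
    return rotl8(~byte_from_8259a & 0xFF, 3)


def to_8259a(byte_from_processor):
    """What the 8259A receives: the inverse mapping (invert, rotate right by 3)."""
    return rotr8(~byte_from_processor & 0xFF, 3)


def priority(vector):
    """80960KB interrupt priority implied by an 8-bit vector."""
    return vector // 8
```

Here `to_8259a(0b00000111)` yields `0b00011111`, matching the ICW2 example, and the two mappings are exact inverses of each other for every byte value.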


The 8259A starts the interrupt cycle by generating an interrupt request (INT) to the 80960KB
processor, which receives the signal at the INTR input pin. This assumes the Interrupt Control
register of the 80960KB processor is set to accommodate an external interrupt controller.


When the 80960KB processor comes to a breakpoint in its execution, it asserts the INTA signal twice.
The first INTA signal acknowledges the interrupt request and causes the 8259A to prioritize the
interrupt requests it received up to this point. The INTA, together with the 8259A-CS, are applied
to the timing control logic to generate a READY signal.


The 80960KB processor automatically asserts the second INTA signal five clock cycles after the
assertion of READY. After the second assertion of INTA, the 80960KB processor reads the interrupt
vector from the 8259A.


The bidirectional buffer inverts and passes the 8-bit vector to the 80960KB processor with the
appropriate lines rearranged. The output enable signal for the data buffer is controlled by INTA for
this operation. After the data transfer is completed, the timing control circuit generates a second
READY signal to terminate the interrupt acknowledge cycle.


The same circuitry can be used to read or write to the 8259A registers. In this case, the 80960KB
processor selects the 8259A through a memory-mapped address. Local address line 2 (A2) selects
one of two internal registers of the 8259A. The I/O read or I/O write command is generated by the
timing control circuit. The data passes through the bidirectional data buffer to or from the selected
register of the 8259A.


The direction of data flow through the buffer is controlled by three logic gates shown in Figure 45.
For an I/O write operation, the I/O Write command and 8259A-CS signal control the output enable
signal of the bidirectional buffer. Similarly, for a read operation, the I/O Read command and the
8259A-CS signal control the output enable signal of the buffer. After the data is transferred, the
timing control circuit asserts READY.

The 82530 Serial Communication Controller is a dual-channel, multi-protocol controller with
on-chip baud rate generators, digital phase locked loops, various data encoding/decoding, and
extensive diagnostic capabilities. The 82530 is designed to interface with high-speed serial
communications lines using a variety of communication protocols, including asynchronous,
synchronous, and HDLC/SDLC protocols. The 82530 contains two independent full-duplex
channels.

The general system interface circuit previously described can be used to connect the 80960KB
processor to the 82530, as shown in Figure 46. The 82530 can send an interrupt request to the
80960KB processor as shown or it can send the interrupt request to an interrupt controller, which in
turn sends it to the 80960KB processor. The 80960KB processor responds to the interrupt request and
issues an address. After the address is latched, the address lines are decoded to generate a chip-select
(82530-CS) signal to activate the 82530.


The lower two address lines, A2 and A3, are used for channel selection and command/data selection.
A2 is connected to the Channel-A/Channel-B (A/B) select input pin. This selects the channel that
performs the serial read or write operations. A3 is connected to the Data/Command (D/C) select input
pin. This signal defines the type of information transferred to or from the 82530 on the data lines (D7
through D0). A high level means data is transferred; a low level indicates a command.


The timing control circuit generates an I/O read or I/O write command based on the W/R command
from the 80960KB processor. When the data transfer is completed, the timing control circuit sends
a READY signal to terminate the transaction.


The 82586 is an intelligent, high-performance communications controller designed to perform most
tasks required for controlling access to a local area network (LAN), such as Ethernet or Starlan. In
many applications, the 82586 is the communication manager for a station connected to a LAN
controller. Such a station usually includes a host CPU, shared memory, a Serial Interface Unit, a
transceiver, and LAN controller link, as shown in Figure 47. The 82586 performs all functions
associated with data transfer between the shared memory and the LAN link, including:


Framing

Link management

Address filtering

Error detection

Data encoding

Network management

Direct memory access

Buffer chaining

High-level (user) command interpretation


The 82586 has two interfaces: a 16-bit bus interface and a network interface to the Serial Interface
Unit. The bus interface is described here. For detailed information on using the 82586, refer to the
Local Area Networking Component User's Manual.

There are several ways to design an interface between the 82586 and the 80960KB processor. The
chosen design example shows how to interface the 82586 using a shared bus. In this example, the
82586 operates in minimum mode at one-half the processor clock frequency.


The primary function of the interface circuit is to allow the 82586 to read and write 16-bit data using
the 32-bit L-bus. This is accomplished by adding the high-order address lines and translating the
16-bit data lines to the 32-bit data lines by using byte enable signals.


Figure 48 shows the 82586 interface circuit, which includes the DRAM controller (see the "DRAM
Controller" section in Chapter 4). This interface uses the general system interface circuit plus other
logic units that specifically pertain to the 82586: the LAN data transceivers, the byte enable converter,
and the LAN address latches. These logic blocks are highlighted by the shaded boxes.

The LAN data transceivers connect 16 data lines from the 82586 to both the upper and lower 16 bits
of the L-bus. The data transfer is controlled by converting A0, A1, and BHE to four byte enable
signals as shown in Figure 49. A1 selects between the upper and lower 16-bit data lines; A0 selects
the lower data byte for either the upper or lower 16-bit data lines; and the byte high enable signal
(BHE) selects the upper data byte for either the upper or lower 16-bit data lines. Data flows through
the buffers when the appropriate byte enable signal is asserted. The direction of the data flow is
controlled by the DT/R signal of the 82586.
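The byte enable conversion can be expressed as a small truth function. The signal polarities are assumptions of this sketch: A0 low is taken to select the lower byte and BHE is taken to be active low, following the usual convention for 16-bit buses. The result marks each byte enable as asserted (True) or not, without modeling the electrical level of the outputs.

```python
def byte_enables(a1, a0, bhe_n):
    """Convert A1, A0, and BHE into four byte enables (BE0, BE1, BE2, BE3).

    a1    : 0 selects the lower 16 data lines, 1 selects the upper 16.
    a0    : 0 selects the lower byte of the chosen half (assumed polarity).
    bhe_n : 0 selects the upper byte of the chosen half (assumed active low).
    """
    low_byte = a0 == 0
    high_byte = bhe_n == 0
    if a1 == 0:
        return (low_byte, high_byte, False, False)
    return (False, False, low_byte, high_byte)
```

A 16-bit transfer in the lower half asserts the first two enables, while a single-byte transfer to the highest byte (A1 = 1, A0 = 1, BHE low) asserts only the last one.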


The LAN address latches are used to demultiplex AD15 through AD0. The address lines and BHE are
latched by the ALE signal from the 82586. The upper address lines (A31 through A16) are generated
by hardware programmable DIP switches.


The 82586 begins operation when the Channel Attention (CA) input signal is asserted. This signal
is generated by gating the write command and 82586 chip select signal.

The interaction between the 82586 and the 80960KB processor is described below and is summarized
in Figure 50.


The 80960KB processor invokes the 82586 by supplying a memory-mapped address and a write
command. The memory-mapped address results in an 82586-CS signal, which is gated with a
write command to produce the CA signal.

The 82586 responds by generating a hold request and waits for HLDA.

The 80960KB processor asserts HLDA, which enables the outputs of the LAN address latches
and disables the outputs of the address latches next to the 80960KB processor. The HLDA signal
also gives control of the L-bus to the 82586.

After the 82586 takes control of the bus, it generates a 16-bit address (AD15 through AD0), an ALE
signal, and a BHE signal. The upper address lines are provided by the programmable DIP
switches to produce an address on the L-bus.

A1 and A0 (from the 82586), and BHE are decoded to generate four byte enable signals (BE0
through BE3). DEN enables the output of the byte enable converter.

DT/R from the 82586 controls the direction of the data flow through the buffers.

The read or write signal from the 82586 is applied to the DRAM controller.

The 82586 accesses DRAM by using the DRAM controller.

The DRAM-RDY signal is asserted by the DRAM controller. This action enables the output of the
LAN data transceiver and terminates the 82586 memory cycle. The timing control logic passes
the DRAM-RDY signal as the READY signal to the 82586.


The 82586 deasserts HOLD and the 80960KB processor deasserts HLDA. The 80960KB
processor regains control of the bus.

The 82786 is a high performance graphics coprocessor that provides high quality text and advanced
display control. It provides full support for graphics primitives at up to 25 million pixels per second
and bit-mapped text up to 25 thousand characters per second. This graphics processor supports
advanced features such as hardware windows, zooming, panning, and scrolling. Intel Application
Note AP-259 and Application Note AP-270 contain detailed information on the 82786.


When using the 82786, it may be necessary for the 80960KB processor to write to graphics memory. 
The interface design example illustrates how the 80960KB processor can transfer a 32-bit data word 
to the 16-bit data bus of the 82786. 


There are several ways to design an interface between the 82786 and the 80960KB processor. In this 
example, the 80960KB processor reads or writes to graphics memory by accessing the 82786 through 
the interface logic circuit. This example assumes that the 82786 operates in the slave mode, and that 
the 80960KB processor does not perform burst transfers. The 80960KB processor only performs 
burst transfers for instructions that specify accesses for more than one word or for instruction fetches. 


The interface circuit translates a 32-bit data bus to a 16-bit data bus by dividing the data lines into
the upper and lower 16 bits and sequencing the data transmission. When the 80960KB processor
writes to graphics memory, the bidirectional transceivers sequence the lower and the upper data bits
of the L-bus to the 16-bit data bus of the 82786.


The process is reversed when the 80960KB processor reads from graphics memory. The
bidirectional transceivers form a 32-bit data word by latching the first 16-bit data word on the lower
data lines, routing the next 16 bits to the upper data lines, and then passing the 32-bit data word on
the L-bus.
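In data terms, the sequencing amounts to splitting a 32-bit word into two 16-bit transfers, lower half first, and reassembling it on a read. The sketch below models only that data movement, not the transceiver control signals; the function names are illustrative.

```python
def split_word(word):
    """Split a 32-bit word into two 16-bit transfers, lower half first."""
    return [word & 0xFFFF, (word >> 16) & 0xFFFF]


def merge_halves(halves):
    """Rebuild a 32-bit word from two 16-bit transfers received lower half first."""
    low, high = halves
    return ((high & 0xFFFF) << 16) | (low & 0xFFFF)
```

For a write of 12345678H, the 16-bit bus carries 5678H and then 1234H; on a read, the first half is held until the second arrives and both are presented together as one 32-bit word.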


Figure 51 shows the details of the graphics controller interface circuit. This interface uses the general
system interface circuit plus the following logic units: the bidirectional transceivers, the data buffer
control, the data bus controller, and the address translator. These logic blocks are highlighted by the
shaded boxes.


The bidirectional transceivers pass data to (from) a 32-bit data bus from (to) a 16-bit data bus. Data
is sequenced through the transceivers by the control signals generated by the data buffer controller.


The data buffer control logic generates the signals that operate and sequence the bidirectional
transceivers. The direction signal for data flow through the transceivers is derived from the W/R
signal of the 80960KB processor. The data buffer control logic generates four output enable signals:
GABL enables the outputs on the B side for the lower 16 bits; GBAL enables the outputs on the A side
for the lower 16 bits; GABH enables the outputs on the B side for the higher 16 bits; and GBAH
enables the outputs on the A side for the higher 16 bits. These output enable signals are derived from
the byte enable signals and are asserted when the slave enable signal (SEN) is activated by the 82786.


The select lines for the bidirectional transceivers allow data to flow from either the latched data or
the input pins. These lines, which are not shown, can be hardwired.


The data bus controller provides the read (RD) and write (WR) commands, memory or I/O signal
(M/IO), and a READY0 signal. This circuit generates two read or write commands for every 32-bit
data transfer to or from the 80960KB processor (one for each 16-bit data transfer). The data bus
controller starts counting clock cycles when the 82786-CS and CYCLE-IN-PROGRESS signals are
asserted. At the proper time (based upon clock counts), it asserts the read/write command. The data
bus controller produces READY0 after receiving the SEN signal from the 82786. READY0 resets the
count, and another read/write command is generated.

The address translator performs four functions: it converts the four byte enable signals to A0, A1, and
BHE; it increments A1 after receiving READY0 for the first 16-bit transfer; it generates the clock
signal (CBAL) that latches the first 16-bit data word in the bidirectional transceivers when the
80960KB processor performs a read operation; and it generates the READY signal for the CPU.


Not shown is the cycle detector circuit that generates the CYCLE-IN-PROGRESS signal. This signal
can be generated by using a circuit similar to the one shown in Figure 44. The start of the cycle
can be detected by gating the ADS and DEN signals. The end of the cycle can be indicated by
READY.


The interaction between the 82786 and the 80960KB processor is summarized in Figure 52. The
operation is divided into two 16-bit data movements for both a read and write operation.

The 80960KB processor generates a memory-mapped address and data for the desired graphics
memory location. It accesses the 82786 by triggering the interface circuit to generate the chip select
signal and several operational signals: the read (RD) or write (WR) command, BHE, and the memory
or I/O (M/IO) signal. The 82786 begins the memory operation after it completes the current graphics
processing activity. The 82786 acknowledges that it is performing a memory operation by asserting
SEN.


After the 82786 asserts SEN, it begins a 16-bit memory read or write operation by translating the
address inputs (A21 through A0) to a multiplexed DRAM address, and generating the DRAM control
signals. Note that A1 and A0 are derived from the byte enable signals.


For a read operation, the data bus controller uses SEN to generate the READY signal. The assertion
of READY causes the address translator to increment A1 and to generate CBAL, which latches the
lower 16 data bits on the B inputs of the bidirectional transceivers to the A side.


Similarly, for a write operation, the data bus controller uses SEN to generate the READY signal. The
assertion of READY causes the address translator to increment A1. The data buffer control uses SEN
and the byte enable signals to produce GABL, which enables the outputs for the lower 16 data bits of
the bidirectional transceivers.


The 82786 then deasserts SEN and the transfer of the first 16 data bits is complete. To transfer the
second 16 data bits, the interface circuit requests another memory operation by generating RD (or
WR), BHE, and M/IO (CS is already asserted). After it completes the current graphics processing
activity, the 82786 begins the memory operation and asserts SEN.


For a read operation, the data bus controller uses SEN to generate the READY signal. The data buffer 
control uses SEN to assert GBAH and GBAL, which enable the outputs for the higher and lower 16 
data bits. 


For a write operation, the data bus controller uses SEN to generate the READY signal. The data
buffer control uses SEN and the byte enable signals to produce GABH, which enables the outputs for
the higher 16 data bits of the bidirectional transceivers.


The address translator generates READY for the 80960KB processor from the second READY to 
terminate the data transfer to the graphics memory. 


The 80960KB processor supports 8-bit, 16-bit, and 32-bit I/O interfaces. A general system interface
circuit can be designed that connects to many slave-type peripherals. This interface can be expanded
to accommodate a bus master peripheral or a 32-bit to 16-bit data bus translator. These interfaces
were illustrated by four design examples.

The 80960KB processor marks the introduction of the 80960 architecture, a 32-bit architecture
from Intel. This architecture has been designed specifically to meet the needs of embedded
applications such as machine control, robotics, processor control, avionics, and instrumentation. It
represents a renewed commitment from Intel to provide reliable, high-performance processors and
controllers for the embedded processor marketplace.


The 80960 architecture can best be characterized as a high-performance computing engine. It
features high-speed instruction execution and ease of programming. It is also easily extensible,
allowing processors and controllers based on this architecture to be conveniently customized to meet
the needs of specific processing and control applications.


Some of the important attributes of the 80960 architecture include:

full 32-bit registers

high-speed, pipelined instruction environment

a convenient program execution environment with 32 general-purpose registers and a versatile
set of special-function registers

a highly optimized procedure call mechanism that features on-chip caching of local variables
and parameters

extensive facilities for handling interrupts and faults

extensive tracing facilities to support efficient program debugging and monitoring

register scoreboarding and write buffering to permit efficient operation with lower performance
memory subsystems


The following sections describe those features of the 80960 architecture that are provided to
streamline code execution and simplify programming. Also described are those features that allow
extensions to be added to the architecture.


Much of the design of the 80960 architecture has been aimed at maximizing the processor's
computational and data processing speed through increased parallelism. The following paragraphs
describe several of the mechanisms and techniques used to accomplish this goal, including:

an efficient load and store memory-access model

caching of code and procedural data

overlapped execution of instructions

many one or two clock instructions

One of the most important features of the 80960 architecture is that most of its operations are
performed on operands in registers, rather than in memory. For example, all the arithmetic, logic,
comparison, branching and bit operations are performed with registers and literals.


This feature provides two benefits. First, it increases program execution speed by minimizing the
number of memory accesses required to execute a program. Second, it reduces memory latency
encountered when using slower, lower-cost memory parts.


To support this concept, the architecture provides a generous supply of general-purpose registers.
For each procedure, 32 registers are available (28 of which are available for general use). These
registers are divided into two types: global and local. Both these types of registers can be used for
general storage of operands. The only difference is that global registers retain their contents across
procedure boundaries, whereas the processor allocates a new set of local registers each time a new
procedure is called.


The architecture also provides a set of fast, versatile load and store instructions. These instructions
allow burst transfers of 1, 2, 4, 8, 12, or 16 bytes of information between memory and the registers.


To further reduce memory accesses, the architecture offers two mechanisms for caching code and
data on chip: an instruction cache and multiple sets of local registers. The instruction cache allows
prefetching of blocks of instructions from memory, which helps ensure that the instruction execution
pipeline is supplied with a steady stream of instructions. It also reduces the number of memory
accesses required when performing iterative operations such as loops. (The size of the instruction
cache can vary. With the 80960KB processor, it is 512 bytes.)


To optimize the architecture's procedure call mechanism, the processor provides multiple sets of
local registers. This allows the processor to perform most procedure calls without having to write
the local registers out to the stack in memory.

(The number of local-register sets provided depends on the processor implementation. The
80960KB processor provides four sets of local registers.)


Another technique that the 80960 architecture employs to enhance program execution speed is
overlapping the execution of some instructions. This is accomplished through two mechanisms:
register scoreboarding and branch prediction.


Register scoreboarding permits instruction execution to continue while data is being fetched from
memory. When a load instruction is executed, the processor sets one or more scoreboard bits to
indicate the target registers to be loaded. After the target registers are loaded, the scoreboard
bits are cleared. While a load is pending, the processor can execute other instructions that do not
use these registers. The processor uses the scoreboard bits to ensure that target registers are not
used until the loads are complete. (The checking of scoreboard bits is carried out transparently to
software.) The net result of using this technique is that code can often be optimized in such a way
as to allow some instructions to be executed in zero clock cycles (that is, executed for free).


Conditional branch instructions commonly cause bottlenecks 
in the instruction execution pipeline, 
since the instruction decoder cannot decode instructions past the branch instruction until it knows the 
direction the branch is going to take. The 80960 architecture 
solves this problem with a technique 
called branch prediction. Branch prediction allows a programmer 
or compiler to select conditional 
branch instructions that indicate to the processor the direction a branch is likely to go. The decoder 
can then continue decoding instructions beyond the branch, even though the branch condition has 
not yet been tested. This technique eliminates waits between the decoder and execution unit, while 
branch conditions are being evaluated. 


Note 


The branch prediction 
mechanism 
is not implemented 
in the 80960KB 
series of processors. 


It is the intent of the 80960 architecture 
that a processor 
be able to execute 
commonly 
used 
instructions such as moves, adds, subtracts, logical operations, and branches in a minimum number 
of clock cycles (preferably one clock cycle). The architecture supports this concept in several ways. 
For example, the load and store model described earlier in this section (with its concentration 
on 
register-to-register 
operations) eliminates the clock cycles required to perform memory-to-memory 
operations. 


Also, all the instructions in the 80960 architecture are 32 bits long and aligned on 32-bit boundaries.
This feature allows instructions to be decoded in one clock cycle. It also eliminates the need for an 
instruction-alignment 
stage in the pipeline. 


The design of the 80960KB processor 
takes full advantage 
of these features of the architecture, 
resulting in over 50 instructions that can be executed in a single clock-cycle. 


The 80960 architecture 
provides 
an efficient 
mechanism 
for servicing 
interrupts 
from external 
sources. To handle interrupts, the processor maintains an interrupt table of 248 interrupt vectors (240 
of which are available for general use). When an interrupt is signalled, the processor uses a pointer 
from the interrupt table to perform an implicit call to an interrupt handler procedure. 
In performing 
this call, the processor automatically 
saves the state of the processor prior to receiving the interrupt; 
performs the interrupt routine; and then restores the state of the processor. A separate interrupt stack 
is also provided to segregate interrupt handling from application programs. 


The interrupt 
handling facilities also feature a method of evaluating 
interrupts 
by priority. 
The 
processor 
is then able to store interrupt vectors that are lower in priority than the task that the 
processor is currently working on in a pending interrupt section of the interrupt table. 
At certain 
defined times, the processor checks the pending interrupts and services them. 


Partly as a side benefit of its streamlined 
execution environment 
and partly by design, processors 
based on the 80960 architecture are particularly easy to program. 
For example, the large number of 
general purpose registers allows relatively 
complex algorithms 
to be executed 
with a minimum 
number of memory accesses. 
The following paragraphs describe some of the other features of the architecture that simplify
programming.


The procedure 
call mechanism 
makes procedure calls and parameter passing between procedures 
simple and compact. 
Each time a call instruction 
is issued, the processor automatically 
saves the 
current set of local registers and allocates a new set of local registers for the called procedure. 
Likewise, on a return from a procedure, the current set of local registers is deallocated 
and the local 
registers for the procedure being returned to are restored. On a procedure call, the program thus never
has to explicitly save and restore those local variables and parameters that are stored in local registers.


The selection of instructions and addressing modes also simplifies programming. The architecture
offers a full set of load, store, move, arithmetic, comparison and branch instructions, with operations
on both integer and ordinal data types. It also provides a complete set of Boolean and bit-field
instructions, to simplify operations on bits and bit strings.


The addressing 
modes are efficient 
and straightforward, 
while at the same time providing 
the 
necessary indexing and scaling modes required to address complex arrays and record structures. 


The large 4-gigabyte address space provides ample room to store programs and data. The availability 
of 32 addressing 
lines allows some address 
lines to be memory-mapped 
to control 
hardware 
functions. 


To aid in program development, 
the 80960 architecture 
defines a wide selection of faults that the 
processor 
detects, including arithmetic faults, invalid operands, 
invalid operations, 
and machine 
faults. When a fault is detected, the processor makes an implicit call to a fault handler routine, using 
a mechanism similar to that described above for interrupts. 
The information collected for each fault allows program developers to quickly correct faulting code.
It also allows automatic recovery from some faults.


To support debugging systems, the 80960 architecture provides a mechanism for monitoring
processor activity by means of trace events. The processor can be configured to detect as many as
seven different trace events, including instruction execution, branch events, calls, supervisor
calls, returns, prereturns, and breakpoints. When the processor detects a trace event, it signals a
trace fault and calls a fault handler. Intel provides several tools that use this feature, including
an in-circuit emulator (ICE) device.


The 80960 architecture 
described earlier in this chapter provides a high-performance 
computing 
engine for use as the computational 
and data processing core of embedded processors or controllers. 
The architecture 
also provides several features that enable processors based on this architecture 
to 
be easily customized to meet the needs of specific embedded applications, such as signal processing, 
array processing, 
or graphics processing. 


The most important of these features is a set of 32 special function registers. These registers provide 
a convenient 
interface to circuitry in the processor 
or to pins that can be connected 
to external 
hardware. 
They can be used to control timers, to perform operations 
on special data types, or to 
perform I/O functions. 


The special function registers are similar to the global registers. 
They can be addressed by all the 
register-access 
instructions. 


The 80960K series of processors offers a complete implementation of the 80960 architecture, plus
several extensions to the architecture. These extensions fall into two categories: floating-point
processing and interagent communication.


The 80960KB processor provides a complete implementation of the IEEE standard for binary
floating-point arithmetic (IEEE 754-1985).
This implementation 
includes a full set of floating-point 
operations, 
including 
add, subtract, 
multiply, 
divide, trigonometric 
functions, 
and logarithmic 
functions. 
These operations are performed on single precision (32-bit), double precision (64-bit), 
and extended precision (80-bit) real numbers. 


One of the benefits of this implementation is that the floating-point handling facilities are
completely integrated into the normal instruction execution environment. Single- and double-precision
floating-point values are stored in the same registers as non-floating-point values. Four 80-bit
floating-point registers are provided to hold extended-precision values.


All of the processors in the 80960K series provide an interagent communication (IAC) mechanism,
which allows agents connected to the processor's bus to communicate with one another. This
mechanism operates similarly to the interrupt mechanism, except that IAC messages are passed
through dedicated sections of memory. The tasks handled with IAC messages include processor
reinitialization, stopping the processor, purging the instruction cache, and forcing the processor to
check pending interrupts.


As has been shown in the preceding discussion, the 80960 architecture offers lots of possibilities and
lots of room to grow. The first implementation of this architecture (the 80960KB processor) provides
average instruction processing rates of 7.5 million instructions per second (7.5 MIPS) at a 20 MHz
clock rate and 10 MIPS at a 25 MHz clock rate. This performance places the 80960KB at the top
of the performance range for advanced VLSI processor architectures.


However, the 80960KB is only the beginning. With improvements in VLSI technology, future
implementations of this architecture will offer even greater performance. They will also offer a
variety of useful extensions to solve specific control and monitoring needs in the field of embedded
applications.


This section describes how the 80960KB processor stores and executes instructions and how it stores
and manipulates data. The parts of the execution environment that are discussed include the address
space, the register model, the instruction pointer, and the arithmetic controls. The execution
environment's procedure stack and procedure-call mechanism are described in Section 3.


When the 80960KB processor is initialized, it sets up an execution environment. It then begins
executing instructions from a program, using this execution environment to store and manipulate
data.


Figure 1 shows the part of the execution environment that the processor sets up to execute a procedure
within a program. This environment consists of a 2^32-byte address space, a set of global and
floating-point registers, a set of local registers, a set of arithmetic-control bits, the instruction
pointer, a set of process-control bits, and a set of trace-control bits. All of these items, except
the address space, reside on the 80960KB chip.


Note


The floating-point registers shown in Figure 1 are not defined in the 80960 architecture. They are
extensions to the architecture that have been added to the 80960KB processor to support floating-point
operations on the extended-real (floating point) data type. (The 80960KA processor does not provide
floating-point registers.)


The 32 special-function registers (shown in Figure 1 in a dashed box) are defined in the 80960
architecture. These registers are not implemented in the 80960KB and 80960KA processors.


When the instruction stream includes a procedure call, a procedure stack and some additional
elements are added to this execution environment. These procedure-call related elements are shown
and discussed in Section 3.


NOTES:


1. REGISTER g15 IS RESERVED FOR STACK MANAGEMENT FUNCTIONS.
2. REGISTERS r0, r1, AND r2 ARE RESERVED FOR STACK MANAGEMENT FUNCTIONS.
3. SPECIAL FUNCTION REGISTERS ARE NOT IMPLEMENTED IN THIS PROCESSOR.


From the point of view of the processor, the address space is flat (unsegmented) and byte addressable,
with addresses running contiguously from 0 to 2^32 - 1. Programs and the kernel can allocate space for
data, instructions, and the stack anywhere within this space, with the following exceptions:


Instructions must be aligned on word boundaries.


Some of the addresses in the upper 16M bytes of the address space (addresses FF000000 through
FFFFFFFF, hexadecimal) are reserved for specific functions. In general, programs and the kernel
should not use this section of the address space.
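

The two placement rules above can be expressed as simple address checks. The following is a minimal sketch in Python, not processor code. The function names are hypothetical, and it conservatively treats the whole upper 16M-byte region as off limits, although the text says only some of those addresses are reserved.

```python
# Illustrative model of the address-space rules; names are hypothetical.
ADDRESS_SPACE_TOP = 0xFFFFFFFF   # 2**32 - 1, the highest byte address
RESERVED_START = 0xFF000000      # start of the upper 16M bytes


def is_valid_address(addr):
    """True if addr lies inside the flat 2**32-byte address space."""
    return 0 <= addr <= ADDRESS_SPACE_TOP


def is_word_aligned(addr):
    """True if addr falls on a word (4-byte) boundary."""
    return addr % 4 == 0


def in_reserved_region(addr):
    """True if addr lies in the upper 16M bytes (assumed off limits here)."""
    return RESERVED_START <= addr <= ADDRESS_SPACE_TOP


def can_place_instruction(addr):
    """Instructions must be word aligned and should avoid the reserved region."""
    return (is_valid_address(addr)
            and is_word_aligned(addr)
            and not in_reserved_region(addr))
```

For example, `can_place_instruction(0x1000)` is accepted, while `0x1002` (not word aligned) and `0xFF000000` (inside the reserved region) are rejected.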


The memory requirements 
to support this address space are given in Section 6 in the section titled 
"Memory Requirements". 


The processor provides three types of data registers: global, floating-point, and local. The 16 global
registers constitute a set of general-purpose registers, the contents of which are preserved across
procedure boundaries. The 4 floating-point registers are provided to support extended floating-point
arithmetic. Their contents are also preserved across procedure boundaries. The 16 local registers
are provided to hold parameters specific to a procedure (i.e. local variables). For each procedure
that is called, the processor allocates a separate set of 16 local registers.


For any one procedure within a program, 36 registers are thus available (as shown in Figure 2): the
16 global registers, the 4 floating-point registers, and the 16 local registers. All of these
registers are maintained on the processor chip.


The 16 global registers (g0 through g15) are 32-bit registers. Each register can thus hold a word (32
bits) of data. Registers g0 through g14 are general-purpose registers; g15 is reserved for the current
frame pointer (FP). The FP contains the address of the first byte in the current (topmost) stack frame.
(The FP and the procedure stack are discussed in detail in Section 3).


The general-purpose global registers (g0 through g14) can hold any of the data types that the
processor recognizes (i.e. ordinals, integers, reals).


The four floating-point registers (fp0 through fp3) are 80-bit registers. These registers can be
accessed only as operands of floating-point instructions. All numbers stored in these registers are
stored in extended-real format. (This format is described in Section 11). The processor automatically
converts floating-point values from real or long-real format into extended-real format when a
floating-point register is used as a destination for an instruction.


Note


The floating-point registers are defined in the 80960 architecture as an option for processors such
as the 80960KB that support floating-point operations. These registers may be omitted from
implementations of the architecture that do not support floating-point operations.


The 16 local registers (r0 through r15) are 32-bit registers, like the global registers. The purpose of
the local registers is to provide a separate set of registers, aside from the global and floating-point
registers, for each active procedure. Each time a procedure is called, the processor automatically
sets up a new set of local registers for that procedure and saves the local registers for the calling
procedure. The program does not have to explicitly save and restore these registers.


Local registers r3 through r15 are general-purpose registers. Registers r0 through r2 are reserved for
special functions, as follows: register r0 contains the previous frame pointer (PFP); r1 contains the
stack pointer (SP); and r2 contains the return instruction pointer (RIP). (The PFP, SP, and RIP are
discussed in detail in Section 3). The processor accesses the local registers at the same speed as it
does the global registers.


Several of the processor's 
instructions operate on multiple-word 
operands. For example, the load- 
long instruction (ldl) loads two words from memory into two consecutive registers. Here, the register 
number for the least significant word is specified in the instruction and the most significant word is 
automatically 
loaded into the next higher numbered register. 


In cases where an instruction specifies a register number and multiple, consecutive registers are
implied, the register number must be even if two registers are accessed (e.g. g0, g2) and an integral
multiple of four if three or four registers are accessed (e.g. g0, g4). If a register reference for a
source value is not properly aligned, the value is undefined. If a register reference for a destination
value is not properly aligned, the registers that the processor writes to are undefined.
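

The alignment rule for multiple-register operands can be checked mechanically. The sketch below is an illustrative Python model with hypothetical function names. It raises an error for a misaligned reference, whereas on the processor the result is simply undefined.

```python
# Illustrative check of the register-alignment rule; names are hypothetical.

def required_alignment(count):
    """Alignment that the starting register number must satisfy."""
    if count == 1:
        return 1
    if count == 2:
        return 2          # register number must be even
    if count in (3, 4):
        return 4          # register number must be a multiple of four
    raise ValueError("an operand spans 1 to 4 registers")


def is_aligned(reg_number, count):
    """True if a count-register operand may start at reg_number."""
    return reg_number % required_alignment(count) == 0


def registers_accessed(reg_number, count):
    """List the consecutive register numbers used by the operand."""
    if not is_aligned(reg_number, count):
        raise ValueError("misaligned register reference (undefined on the processor)")
    return list(range(reg_number, reg_number + count))
```

With this model, a two-word operand may start at g0 or g2 but not at g3, and a three- or four-word operand may start at g0 or g4 but not at g2.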


The 80960KB provides a mechanism called register scoreboarding that in certain situations permits
instructions to be executed concurrently. This mechanism works as follows. While an instruction
is being executed, the processor sets a scoreboard bit to indicate that a particular register or group
of registers is being used in an operation. If the instructions that follow do not use registers in that
group, the processor in some instances is able to execute those instructions before execution of the
prior instruction is complete. In effect, the register scoreboarding mechanism allows some
instructions to be executed for free (zero clock cycles).


A common application of this feature is to execute one or more fast instructions (instructions that take
one to three clock cycles) concurrently with load instructions. A load instruction typically takes 3
to 9 clock cycles (depending on the design of system memory). Register scoreboarding allows other
instructions to be executed concurrently with the load instruction, provided that the other instructions
do not affect the registers being loaded. For example, the following group of instructions loads a
group of local registers while performing some other operations on data in global registers.


    ld    xyz, r6            # r6  <- data from address xyz
    addi  g4, g6, g7         # g7  <- g4 + g6
    addi  g9, g10, g11       # g11 <- g9 + g10
    ld    abc, r8            # r8  <- data from address abc
    and   g0, 0xffff, g1     # g1  <- g0 AND 0xffff
    addi  r6, r8, r7         # r7  <- r6 + r8


Here, the two addi instructions following the first load and the and instruction following the second 
load are performed for free. 
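

The effect of scoreboarding on this instruction group can be illustrated with a small timing model. The following Python sketch is a deliberate simplification and does not reproduce the actual 80960KB pipeline. It assumes a fixed load latency of 4 clocks, one clock per non-load instruction, and in-order issue. All names in it are hypothetical.

```python
# Simplified timing model of register scoreboarding (assumptions, not real timings).

def run(program, load_latency=4, scoreboard=True):
    """Return total clocks for a list of (op, sources, dest) tuples."""
    busy_until = {}   # register -> clock at which a pending load completes
    clock = 0
    for op, sources, dest in program:
        # Stall until no register this instruction touches is scoreboarded.
        clock = max([busy_until.get(r, 0) for r in sources + [dest]] + [clock])
        if op == "ld":
            if scoreboard:
                busy_until[dest] = clock + load_latency  # completes in background
                clock += 1                               # issue costs one clock
            else:
                clock += load_latency                    # wait for the data
        else:
            clock += 1
    # Execution ends when the last instruction and all pending loads are done.
    return max([clock] + list(busy_until.values()))


program = [
    ("ld",   [],            "r6"),
    ("addi", ["g4", "g6"],  "g7"),
    ("addi", ["g9", "g10"], "g11"),
    ("ld",   [],            "r8"),
    ("and",  ["g0"],        "g1"),
    ("addi", ["r6", "r8"],  "r7"),
]

with_scoreboard = run(program)                       # 8 clocks in this model
without_scoreboard = run(program, scoreboard=False)  # 12 clocks in this model
```

In this model the three instructions on global registers overlap with the pending loads, and only the final addi has to wait for r8. Actual cycle counts depend on the memory system and on the details given in Appendix C.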


The other situation where scoreboarding can be useful for procedure optimization is when
floating-point instructions are being executed. Floating-point operations are handled by a separate
execution unit in the processor. So, non-floating-point instructions can often be executed concurrently
with floating-point instructions, provided that they do not use the same registers and do not use the
arithmetic-logic unit (ALU). (A detailed description of the register-scoreboarding mechanism is
given in Appendix C.)


The instruction pointer (IP) is the address (in the address space) of the instruction currently being
executed. This address is 32 bits; however, since instructions are required to be aligned on word
boundaries in memory, the 2 least significant bits of the IP are always zero.


Instructions in the processor are one or two words long. The IP gives the address of the lowest order 
byte of the first word of the instruction. 


The IP is stored in the processor and cannot be read directly. However, the IP-with-displacement
addressing mode allows the IP to be used as an offset into the address space. This addressing mode
can also be used with the lda (load address) instruction to read the current value of the IP.


When a break occurs in the instruction stream (due to an interrupt or a procedure call), the IP of the 
next instruction to be executed (i.e. the RIP) is stored in local register r2, which is then stored on the 
stack. Refer to Section 3 for further discussion of this operation. 


The processor's arithmetic controls are made up of a set of 32 bits, which are cached on the processor
chip in the arithmetic-controls register. Figure 3 shows the arrangement of the arithmetic controls
bits. The arithmetic controls bits include condition code bits; floating-point control and status bits;
integer control and status bits; and a bit that controls faulting on imprecise faults.


The processor sets or clears these bits to show the results of certain operations. For example, the
processor modifies the condition code bits after each comparison operation to show the result of the
comparison. Other arithmetic control bits, such as the floating-point fault masks, are set by the
currently running program to tell the processor how to respond to certain fault conditions.


Note


The arithmetic status flags and the floating-point flags and masks are not defined in the 80960
architecture. They are an extension of the architecture, which is provided in the 80960KB processor
to support floating-point operations. For implementations of the architecture that do not support
floating-point operations, these flags and masks are reserved bits.


The state of the processor's arithmetic controls is undefined at processor initialization or on a
processor reinitialize (initiated with a reinitialize processor IAC). Part of the initialization code
should thus be to set the arithmetic controls to a specific state.


The arithmetic controls can be examined and modified using the modify AC (modac) instruction.
This instruction uses a mask to allow specific bits to be checked and changed.
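

The masked read-modify-write behavior can be modeled in a few lines. The following Python sketch is an illustration only. It assumes that bits selected by the mask are taken from a source value, that unselected bits keep their current state, and that the previous contents are returned to the caller. Operand order and encoding of the real instruction are given in the instruction reference and are not modeled here.

```python
# Illustrative model of a masked update of the arithmetic controls.
MASK32 = 0xFFFFFFFF


def modac(ac, mask, src):
    """Return (new_ac, old_ac) after a masked update of the 32-bit AC value."""
    new_ac = (src & mask) | (ac & ~mask)
    return new_ac & MASK32, ac & MASK32


# Reading without changing anything: use a mask of zero.
unchanged, current = modac(0x00000005, 0x0, 0xFFFFFFFF)

# Setting the integer overflow mask (bit 12) and nothing else.
IOV_MASK_BIT = 1 << 12
updated, previous = modac(0x00000002, IOV_MASK_BIT, IOV_MASK_BIT)
```

A mask of zero leaves the arithmetic controls untouched and simply returns their current value, which is how the same operation can serve both to examine and to modify the bits.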


The processor automatically 
saves and restores the arithmetic controls when it services an interrupt 
or handles a fault. Here, the processor saves the current state of the arithmetic controls in an interrupt 
record or fault record, then restores the arithmetic controls upon returning from the interrupt or fault 
handler, respectively. 


The modac instruction 
can be used to explicitly 
save and restore the contents of the arithmetic 
controls. 


Note


In the following discussion, some of the arithmetic control bits are referred to as "sticky flags". A
sticky flag is one that the processor never implicitly clears. Once the processor sets a sticky flag to
indicate that a particular condition has occurred, the flag remains set until the program explicitly
clears it.


The processor sets the condition code flags (bits 0-2) to indicate the results of certain instructions
(usually compare instructions). Other instructions, such as conditional-branch instructions,
examine these flags and perform functions according to their state. Once the processor has set these
flags, it leaves them unchanged until it executes another instruction that uses these flags to store
results.


These flags are used to show either true or false conditions or inequalities (greater-than, equal, or
less-than conditions). To show true or false conditions, the flags are set as shown in Table 1.


Table 1

    Condition Code    Condition
    010               true
    000               false


To show inequalities, the flags are set as shown in Table 2.


Table 2

    Condition Code    Condition
    000               unordered
    001               greater than
    010               equal
    100               less than
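

The inequality encodings can be illustrated with a small comparison model. This Python sketch is not the processor's compare instruction. It assumes that "greater than" means the first operand is greater than the second, and the mask-based branch test is a hypothetical helper added for illustration.

```python
import math

# Condition code encodings from the inequality table.
CC_UNORDERED = 0b000
CC_GREATER = 0b001
CC_EQUAL = 0b010
CC_LESS = 0b100


def compare_real(src1, src2):
    """Return the condition code for comparing two floating-point values."""
    if math.isnan(src1) or math.isnan(src2):
        return CC_UNORDERED          # a NaN makes the relationship unordered
    if src1 < src2:
        return CC_LESS
    if src1 == src2:
        return CC_EQUAL
    return CC_GREATER


def condition_holds(cc, mask):
    """Hypothetical branch test: true if any condition selected by mask is set."""
    return (cc & mask) != 0
```

Because each ordered outcome sets a different bit, a combined test such as "greater than or equal" can be written as the mask `CC_GREATER | CC_EQUAL`. An unordered result sets no bits and so fails every such test.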


The terms ordered and unordered are used when comparing floating-point numbers. If, when
comparing two floating-point values, one of the values is a NaN (not a number), the relationship is
said to be "unordered". Refer to the portion of Section 11 entitled "Comparison and Classification"
for further information about the ordered and unordered conditions.


The processor uses the arithmetic status fields (bits 3-6) in conjunction with the classify instructions 
(classr and classrl) to show the class of a floating-point 
number. When executing these instructions, 
the processor sets the arithmetic status bits as shown in Table 3, according to the class of the value 
being classified. 


Table 3

    Arithmetic Status    Classification
    s000                 zero
    s001                 denormalized number
    s010                 normal finite number
    s011                 infinity
    s100                 quiet NaN
    s101                 signaling NaN
    s110                 reserved operand


The integer overflow mask (bit 12) and the integer overflow flag (bit 8) are used in conjunction with 
the arithmetic integer-overflow 
fault. The mask bit masks the integer-overflow 
fault. When the fault 
is masked, the processor 
sets the integer overflow flag whenever an integer or decimal overflow 
occurs, to indicate that the fault condition has occurred even though the fault has been masked. 
If 
the fault is not masked, the fault is allowed to occur and the flag is not set. The integer overflow flag 
is a sticky flag. (Refer to the discussion of the arithmetic integer-overflow 
fault in Section 8 for more 
information 
about the integer overflow mask and flag.) 


The no imprecise faults flag (bit 15) determines 
whether or not imprecise faults are allowed to be 
raised. If set, faults are required to be precise; if clear, certain faults can be imprecise. 
(Refer to the 
portion of Section 8 titled "Precise and Imprecise Faults" for more information 
about this flag.) 


The floating-point flags (bits 16 through 20) and masks (bits 24 through 28) perform the same
functions as the integer overflow flag and mask, except they are used for operations on real
(floating-point) numbers. When a mask bit is set, its associated floating-point fault is masked. If a
mask bit is set, the processor sets the flag for the associated fault whenever the fault condition
occurs. All the floating-point flag bits are sticky bits. Refer to the portion of Section 11 titled
"Exceptions and Fault Handling" for a detailed discussion of the floating-point faults and their
associated flag and mask bits in the arithmetic controls.


The floating-point normalizing mode flag (bit 29) determines whether or not floating-point
instructions are allowed to operate on denormalized numbers. If set, floating-point instructions are
allowed to operate on denormalized numbers; if clear, the processor generates a floating
reserved-operand fault when it detects denormalized numbers that are used as operands for
floating-point instructions. (Refer to "Normalizing Mode" in Section 11 for more information on the
use of this flag.)


The floating-point rounding control field (bits 30-31) indicates which rounding mode is in effect for
floating-point computations. These bits are set as shown in Table 4, depending on the rounding mode
to be selected.


Table 4

    Rounding Control    Rounding Mode
    00                  Round to nearest (even)
    01                  Round down (toward negative infinity)
    10                  Round up (toward positive infinity)
    11                  Truncate (round toward zero)
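

The bit positions given in this section for the arithmetic controls can be collected into a small decoder. The following Python sketch is illustrative. The field names are hypothetical, bits not mentioned in the text are ignored, and it assumes that the left digit of each rounding control value corresponds to bit 31.

```python
# Illustrative decoder for the arithmetic controls word; field names are hypothetical.
ROUNDING_MODES = {
    0b00: "round to nearest (even)",
    0b01: "round down (toward negative infinity)",
    0b10: "round up (toward positive infinity)",
    0b11: "truncate (round toward zero)",
}


def field(value, low_bit, width):
    """Extract a bit field of the given width starting at low_bit."""
    return (value >> low_bit) & ((1 << width) - 1)


def decode_ac(ac):
    """Split a 32-bit arithmetic controls value into its documented fields."""
    return {
        "condition_code": field(ac, 0, 3),          # bits 0-2
        "arithmetic_status": field(ac, 3, 4),       # bits 3-6
        "integer_overflow_flag": field(ac, 8, 1),   # bit 8
        "integer_overflow_mask": field(ac, 12, 1),  # bit 12
        "no_imprecise_faults": field(ac, 15, 1),    # bit 15
        "fp_flags": field(ac, 16, 5),               # bits 16-20
        "fp_masks": field(ac, 24, 5),               # bits 24-28
        "fp_normalizing_mode": field(ac, 29, 1),    # bit 29
        "rounding_mode": ROUNDING_MODES[field(ac, 30, 2)],  # bits 30-31
    }


example = decode_ac((0b11 << 30) | (1 << 12) | 0b010)
```

In the example value, the condition code is 010, the integer overflow mask is set, and the rounding control selects truncation.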


(Refer to "Rounding Control" in Section 11 for more information on the use of the floating-point
rounding control bits.)


The processor's process controls and trace controls are also cached on the processor chip. The
process controls are a set of 32 bits that control or show the current execution state of the processor.
The process controls are described in detail in Section 6.


The trace controls are a set of 32 bits that control the tracing facilities of the processor. 
The trace 
controls are described in Section 10. 


The processor provides a 512-byte cache for instructions. 
When the processor fetches an instruction 
or group of instructions 
from memory, they are stored in this cache before being fed into the 
instruction-execution 
pipeline. 
The processor manages this cache transparently 
from the program 
being run. 


This instruction cache is a read-only cache, meaning that once bytes from the instruction stream are
written into the instruction cache, they cannot be changed. Because of this, the processor does not
support self-modifying programs in a transparent fashion. The only way to change the instruction
stream once it has been written into the instruction cache is to purge the instruction cache. The IAC
message "purge instruction cache" is provided for this purpose, as described in Section 12.


Note


The purge instruction cache IAC is not defined in the 80960 architecture. It is an
implementation-dependent feature of the 80960KB processor.


This section describes the 80960KB processor's procedure call and stack mechanism. It also
describes the supervisor call mechanism, which provides a means of calling privileged procedures
such as kernel services.


The processor supports three types of procedure calls: 


Local call 


System call 


Branch and link 


A local call uses the processor's call/return mechanism, in which a new set of local registers and a
new frame on the stack are allocated for the called procedure. A system call is similar to a local call;
however, it provides access to procedures through a system procedure table. The most important use
of a system call is to call privileged procedures called supervisor procedures. A system call to a
supervisor procedure is called a supervisor call. A branch and link is merely a branch to a new
instruction with the return IP stored in a global register.


In this section, the call/return mechanism is introduced first and is followed by a discussion of how 
this mechanism 
is used to make local calls and system calls. 


Note


The processor's interrupt- and fault-handling mechanisms are implicit procedure calls. These implicit
calls are described in detail in Sections 7 and 8, respectively.


The processor's 
call/return mechanism has been designed to simplify procedure calls and to provide 
a flexible method for storing and handling variables that are local to a procedure. 


Two structures support this mechanism: the local registers (on the processor chip) and the procedure
stack (in memory). Figure 4 shows the relationship of the local registers to the procedure stack.


For each procedure, the processor automatically allocates a set of local registers and a frame on the
procedure stack. Since the local registers are on-chip, they provide fast-access storage for local
variables. If additional space for local variables is required, it can be allocated in the stack frame.


When a procedure call is made, the processor automatically 
saves the contents of the local registers 
and the stack frame for the calling procedure and sets up a new set of local registers and a new stack 
frame for the called procedure. 


This procedure 
call mechanism 
provides two benefits. First, it provides a structure for storing a 
virtually unlimited number of local variables for each procedure: the on-chip local registers provide 
quick access to often-used variables and the stack provides space for additional variables. 


Second, a program does not have to explicitly 
save and restore the variables 
stored in the local 
registers and stack frames. The processor does this implicitly on procedure calls and on returns. 


For each procedure, the processor allocates a set of 16 local registers. Three of these registers (r0,
r1, and r2) are reserved for linkage information that ties the procedures together. The remaining
13 local registers are available for general storage of variables.


The processor maintains a procedure stack in memory for use when performing local calls. This stack 
can be located anywhere in the address space and grows from low addresses to high addresses. 


The stack consists of contiguous frames, one frame for each active procedure. As shown in Figure
5, each stack frame provides a save area for the local registers and an optional area for additional
variables.


To increase the speed of procedure calls, the 80960KB processor provides four sets of local registers.
Thus, when a procedure call is made, the contents of the current set of local registers often do not have
to be stored in the procedure stack. Instead, a new set of local registers is assigned to the called
procedure. When procedure calls are nested more than four deep, the processor automatically stores
the contents of the oldest set of local registers on the stack to free up a set of local registers for the
most recently called procedure.
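

The caching behavior described above can be pictured with a small model. The following Python sketch is illustrative only: the class and method names are hypothetical, and it does not describe the processor's internal logic. It assumes four on-chip register sets and writes the oldest set to a simulated procedure stack when calls nest more than four deep.

```python
from collections import deque

NUM_REGISTER_SETS = 4  # the 80960KB provides four on-chip local register sets


class LocalRegisterCache:
    """Toy model of the on-chip local register sets (hypothetical names)."""

    def __init__(self):
        self.on_chip = deque()  # oldest set on the left, current set on the right
        self.spilled = []       # sets that have been written to the procedure stack

    def call(self, procedure_name):
        # If every on-chip set is in use, store the oldest one in memory first.
        if len(self.on_chip) == NUM_REGISTER_SETS:
            self.spilled.append(self.on_chip.popleft())
        self.on_chip.append(procedure_name)


cache = LocalRegisterCache()
for name in ["main", "a", "b", "c", "d"]:
    cache.call(name)
```

In this model the first four procedures each receive an on-chip set without any memory traffic; only the fifth nested procedure forces the oldest set out to the stack.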


Refer to the subsection "Mapping the Local Registers to the Procedure Stack" for further discussion 
of the relationship 
between the local register sets and the procedure stack. 


Global register g15 (FP) and local registers r0 (PFP), r1 (SP), and r2 (RIP) contain information to link
procedures together and to link the local registers to the procedure stack. The following paragraphs
describe this linkage information.


The FP is the address of the first byte of the current (topmost) stack frame. On procedure calls, the
FP for the new frame is stored in global register g15; on returns, the FP for the previous frame is
restored in g15.


The 80960KB processor aligns each new stack frame on a 64-byte boundary. Since the resulting FP
always points to a 64-byte boundary, the processor ignores the 6 low-order bits of the FP and
interprets them to be zero.


Note


The alignment boundary for new frames is defined by means of an implementation-dependent
parameter called SALIGN. The relationship of SALIGN to the frame alignment boundary is described
in Appendix E.


The procedure stack grows upward (i.e., toward higher addresses). The SP points to the next available
byte of the stack frame, which can also be thought of as the last byte of the stack frame plus one. To
determine the initial SP value, the processor adds 64 to the FP.


If additional space is needed on the stack for local variables, the SP may be incremented in one-byte
increments. For example, the following instruction adds six words of additional space to the stack:


When the processor creates a new frame on a procedure call, it will, if necessary, add a padding area
to the stack so that the new frame starts on a 64-byte boundary. To create the padding area, the
processor rounds the SP for the current stack frame (the value in r1) up to the next 64-byte
boundary. This value becomes the FP for the new stack frame.
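

The stack-pointer arithmetic above can be worked through numerically. The following Python sketch is illustrative only: the function names and sample addresses are hypothetical, and it assumes that an SP already on a 64-byte boundary is left unchanged by the rounding step.

```python
FRAME_ALIGNMENT = 64    # bytes; new frames start on a 64-byte boundary
REGISTER_SAVE_AREA = 64  # bytes; 16 local registers of 4 bytes each


def initial_stack_pointer(fp):
    """The processor forms the initial SP by adding 64 to the FP."""
    return fp + REGISTER_SAVE_AREA


def allocate_words(sp, words):
    """Reserve extra space for local variables by advancing the SP."""
    return sp + 4 * words


def next_frame_pointer(sp):
    """Round the caller's SP up to the next 64-byte boundary (assumed rule)."""
    return (sp + FRAME_ALIGNMENT - 1) & ~(FRAME_ALIGNMENT - 1)


fp = 0x1000                          # hypothetical frame address
sp = initial_stack_pointer(fp)       # 0x1040
sp = allocate_words(sp, 6)           # six words (24 bytes) of local storage
new_fp = next_frame_pointer(sp)      # padding is added up to the boundary
```

With these sample values, 24 bytes of local storage move the SP to 0x1058, so the next frame begins at 0x1080 and the 40 bytes in between form the padding area.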


The PFP is the address of the first byte of the previous stack frame. Since the 80960KB ignores the
6 low-order bits of the FP, only the 26 most-significant bits of the PFP are stored in r0. The 4
least-significant bits of r0 are then used to store return status information.


Bits 0 through 2 of local register r0 contain return status information for the calling procedure and
bit 3 contains the prereturn-trace flag. When a procedure call is made (either explicit or implicit), the
processor records the call type in the return status field. The processor then uses this information to
select the proper return mechanism when returning to the calling procedure.


Table 5 shows the encoding of the return status field according to the different types of calls that the
processor supports. Of the five types of calls allowed, the fault call (described in Section 8) and the
interrupt and stopped-interrupt calls (described in Section 7) are implicit calls that the processor
initiates. The local call (described in this section) is an explicit call that a program initiates using the
call or callx instruction. The supervisor call (described at the end of this section in the portion
"User-Supervisor Protection Model") is an explicit call that a program makes using the calls instruction.


Encoding   Call Type                           Return Action

000        Local call, or supervisor call      Local return
           made from the supervisor mode

001        Fault call                          Fault return

010        Supervisor call from user mode,     Supervisor return, with the trace
           trace was disabled before call      enable flag in the process controls
                                               set to 0 and the execution mode
                                               flag set to 0

011        Supervisor call from user mode,     Supervisor return, with the trace
           trace was enabled before call       enable flag in the process controls
                                               set to 1 and the execution mode
                                               flag set to 0

100        reserved

101        reserved

110        Stopped-interrupt call              Stopped-interrupt return

111        Interrupt call                      Interrupt return


The third column of Table 5 shows the type of return action that the processor takes depending on
the state of the return status field.
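

The layout of register r0 and the encodings in Table 5 can be expressed as a short decoder. The following Python sketch is illustrative only: the function name, dictionary, and sample value are hypothetical. It assumes the field positions stated above (PFP in the 26 most-significant bits, prereturn-trace flag in bit 3, return status in bits 0 through 2).

```python
RETURN_ACTIONS = {
    0b000: "local return",
    0b001: "fault return",
    0b010: "supervisor return (trace-enable flag set to 0)",
    0b011: "supervisor return (trace-enable flag set to 1)",
    0b110: "stopped-interrupt return",
    0b111: "interrupt return",
}


def decode_r0(r0):
    """Split a 32-bit r0 value into PFP, prereturn-trace flag, and return status."""
    pfp = r0 & 0xFFFFFFC0                # 26 most-significant bits
    prereturn_trace = (r0 >> 3) & 1      # bit 3
    status = r0 & 0b111                  # bits 0 through 2
    action = RETURN_ACTIONS.get(status, "reserved")
    return pfp, prereturn_trace, status, action


# Hypothetical r0 value: previous frame at 0x1040, trace flag set, status 011.
fields = decode_r0(0x0000104B)
```

Encodings 100 and 101 are not in the dictionary, so the decoder reports them as reserved.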


The processor records two versions of the supervisor call: one for when the trace-enable flag in the 
process controls is set prior to a supervisor call and one for when the flag is clear prior to the call. 
The trace controls are described in detail in Section 9. 


The prereturn-trace 
flag is used in conjunction 
with the call-trace and prereturn-trace 
modes. If the 
call-trace mode is enabled when a call is made, the processor sets the prereturn-trace 
flag; otherwise 
it clears the flag. Then, if this flag is set and the prereturn-trace 
mode is enabled, a prereturn trace 
event is generated on a return before any actions associated with the return operation are performed. 
Refer to Section 9 for a detailed discussion of the interaction of the call-trace and prereturn-trace 
modes and the prereturn-trace 
flag. 


The RIP is the address of the instruction 
that the processor 
is to execute after returning 
from a 
procedure call. This instruction is the instruction that follows the procedure call instruction. 


Since the processor 
uses the same procedure 
call mechanism 
to make implicit procedure 
calls to 
service faults and interrupts, programs should not use register r2 for purposes other than to hold the 
RIP. 


The availability of multiple register sets cached on the processor chip and the saving and restoring
of these register sets in stack frames should be transparent to most programs. However, the following
additional information about how the local registers and procedure stack are mapped to one another
can help avoid problems.


Since the local-register sets reside on the processor chip, the processor will often not have to access
the stack frame in the procedure stack, even though space has been allocated on the stack for the
current frame. The processor only accesses the current frame in the procedure stack in the following
instances:


1. to read or write variables other than those held in the local registers, or


2. to read local registers that were stored in the procedure stack due to the nesting of procedure
   calls more than four deep.


This method of mapping the local registers to the register-save areas in the procedure stack has
several implications. First, storing information in a local register does not guarantee that it will be
stored in its associated word in the current stack frame. Likewise, storing information in the first 16
words of a stack frame does not guarantee that the local registers associated with the stack frame are
modified.


Second, if you try to read the contents of the current set of local registers through a memory access
to the first 16 words of the current stack frame, you may not get the expected result. This is also true
if you try to read the contents of a previously stored set of local registers through a memory access
to its associated stack frame.


The processor automatically 
stores the contents of a local register set into the register-save 
area of 
its associated stack frame only if the nesting of procedure calls (local or supervisor) 
is deeper than 
the number of local register sets. 


Occasionally, it is necessary to have the contents of all local register sets match the contents of the
register-save areas in their associated stack frames. For example, when debugging software it may
be necessary to trace the call history back through the nested procedures. This cannot be done unless
the cached local-register frames are flushed (i.e., written out to the procedure stack).


The processor provides the flushreg (flush local registers) instruction to allow voluntary flushing of
the local registers. This instruction causes the contents of all the local-register sets, except the current
set, to be written to their associated stack frames in memory.


Third, if you need to modify the previous FP in register r0, you should precede this operation with
the flushreg instruction, or else the behavior of the ret (return) instruction is not predictable.


Fourth, local registers should not be used for passing parameters between procedures. (Parameter
passing is discussed in the following subsection.)


Fifth, when a set of local registers is assigned to a new procedure, the processor may not clear or
initialize these registers. The initial contents of these registers are therefore unpredictable. Also, the
processor does not initialize the local register-save area in the newly created stack frame for the
procedure, so its contents are equally unpredictable.


A local call is made using either of two local call instructions: call and callx. These instructions
initiate a procedure call using the call/return mechanism described earlier in this section.


The call instruction specifies the address of the called procedure as the IP plus a signed, 24-bit
displacement (i.e., -2^23 to 2^23 - 4).


The callx instruction allows any of the addressing modes to be used to specify the procedure address.
The IP with displacement addressing mode allows full 32-bit IP-relative addressing.
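

The range of the call displacement follows from its being a signed, word-aligned, 24-bit value. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes the target is the IP plus the sign-extended displacement, wrapped to 32 bits.

```python
def sign_extend_24(value):
    """Interpret the low 24 bits of value as a two's-complement integer."""
    value &= 0xFFFFFF
    return value - (1 << 24) if value & 0x800000 else value


def call_target(ip, displacement_field):
    """Return the IP plus the signed 24-bit displacement, wrapped to 32 bits."""
    displacement = sign_extend_24(displacement_field)
    # Instructions are word aligned, so the displacement is a multiple of 4.
    assert displacement % 4 == 0, "displacement must be word aligned"
    return (ip + displacement) & 0xFFFFFFFF


farthest_forward = sign_extend_24(0x7FFFFC)    # 2^23 - 4
farthest_backward = sign_extend_24(0x800000)   # -2^23
```

The largest forward displacement is 2^23 - 4 rather than 2^23 - 1 because the target must fall on a word boundary.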


During a local call, the processor performs the following operations: 


1. Stores the RIP in current local register r2.

2. Allocates a new set of local registers for the called procedure.

3. Allocates a new frame on the procedure stack.

4. Changes the instruction pointer to point to the first instruction in the called procedure.

5. Stores the PFP in new local register r0.

6. Stores the FP for the new frame in global register g15.

7. Allocates a save area for the new local registers in the new stack frame.

8. Stores the SP in new local register r1.


On a return, the processor performs these operations: 


1. Sets the FP in global register g15 to the value of the PFP in current local register r0.

2. Deallocates the current local registers for the procedure that initiated the return and switches to
   the local registers assigned to the procedure being returned to.

3. Deallocates the stack frame for the procedure that initiated the return.

4. Sets the IP to the value of the RIP in new local register r2.


The algorithms that the call, callx, and ret instructions use are described in greater detail in Section
10.
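

The call and return operations listed above can be traced with a small software model. The following Python sketch is illustrative only: the class, its dictionary of frames, and the sample addresses are hypothetical, and the model ignores register caching, return status bits, and supervisor calls.

```python
class ProcedureLinkage:
    """Toy model of FP (g15), PFP (r0), SP (r1), and RIP (r2) across calls."""

    def __init__(self, stack_base):
        self.fp = stack_base  # g15
        self.ip = 0
        # One record of linkage registers per active frame, keyed by frame address.
        self.frames = {stack_base: {"pfp": 0, "sp": stack_base + 64, "rip": 0}}

    def call(self, target, return_ip):
        caller = self.frames[self.fp]
        caller["rip"] = return_ip                 # RIP into the caller's r2
        new_fp = (caller["sp"] + 63) & ~63        # new frame on a 64-byte boundary
        self.frames[new_fp] = {
            "pfp": self.fp,                       # PFP into the new r0
            "sp": new_fp + 64,                    # SP into the new r1
            "rip": 0,
        }
        self.fp = new_fp                          # FP for the new frame into g15
        self.ip = target                          # first instruction of the callee

    def ret(self):
        callee_fp = self.fp
        self.fp = self.frames[callee_fp]["pfp"]   # FP takes the value of the PFP
        del self.frames[callee_fp]                # deallocate the callee's frame
        self.ip = self.frames[self.fp]["rip"]     # IP takes the caller's RIP


machine = ProcedureLinkage(stack_base=0x1000)
machine.call(target=0x5000, return_ip=0x0204)
fp_in_callee = machine.fp
machine.ret()
```

After the return, the model is back in the original frame with the IP set to the instruction that follows the call.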


The processor supports two mechanisms for passing parameters between procedures: global
registers and an argument list.


The global registers provide the fastest method of passing parameters. 
Here, the calling procedure 
copies the parameters 
to be passed into global registers. 
The called procedure 
then copies the 
parameters 
(if necessary) out of the global registers after the call. 


On a return, the called procedure can copy result parameters into global registers prior to the return, 
with the calling procedure copying them out of the global registers after the return. 


When more parameters need to be passed than will fit in the global registers, they can be placed in 
an argument list. This argument list can be stored anywhere in memory providing that the procedure 
being called has a pointer to the list. Commonly, a pointer to the argument list is placed in a global 
register. 


Parameters can also be returned to the calling procedure through an argument list. Here again, a
pointer to the argument list is generally returned to the calling procedure through a global register.


The argument list method of passing parameters should be thought of as an escape mechanism 
and 
used only when there are not enough global registers available for passing parameters. 


A convenient place to store an argument list is in the stack frame for the calling procedure. Storing 
the argument list in the stack provides the benefit of having the list automatically 
deallocated 
upon 
returning from the procedure that set up the list. Space for the argument list is created by incrementing 
the SP, as described earlier in this chapter in the section titled "Stack Pointer". 


Parameters 
can also be returned to the calling procedure 
through an argument 
list in the stack. 
However, care should be taken when doing this. The return argument list must not be placed in the 
frame for the called procedure, since this frame is deallocated 
on the return. Also, if the return list 
is to be placed in the frame of the calling procedure, the calling procedure must allocate space for this 
list prior to making the call. 


A system call is made using the call system (calls) instruction. This call is similar to a local call except
that the processor gets the IP for the called procedure from a data structure called the system
procedure table. (System calls are sometimes referred to in this chapter as "system procedure-table
calls".)


Figure 6 illustrates the use of the system procedure table in a system call. The calls instruction
requires a procedure-number operand. This procedure number provides an index into the system
procedure table, which contains IPs for specific procedures.


The system call mechanism supports two types of procedure calls: local calls and supervisor calls.
A local call is the same as that made with the call and callx instructions, except that the processor gets
the IP of the called procedure from the system procedure table. The supervisor call differs from the
local call in two ways: (1) it causes the processor to switch to another stack (called the supervisor
stack), and (2) it causes the processor to switch to a different execution mode.


The system call mechanism offers two benefits. First, it supports portability for application software. 
System calls are commonly used to call kernel services. By calling these services with a procedure 
number rather than a specific IP, applications 
software does not have to be changed each time the 
implementation 
of the kernel services is modified. 


Second, the ability to switch to a different execution mode and stack allows kernel procedures and
data to be insulated from applications code. This benefit is described in more detail in
"User-Supervisor Protection Model" later in this chapter.


The system procedure table is a general structure, which the processor uses in two ways. The first 
way is as a place for storing IPs for kernel procedures, which can then be accessed through the system 
call mechanism. The processor gets a pointer to the system procedure table from the initial memory 
image (IMI) as described in Section 6, "System Data-Structure 
Pointers". 


The second way a system procedure table is used is as a place for storing IPs for fault handler
procedures. Here, the processor gets a pointer to the system procedure table from entries in the fault
table, as described in Section 8, "Fault-Table Entries".


The structure of the system procedure table is shown in Figure 7. The following paragraphs describe 
the fields in this table. 


The procedure entries specify the target IPs for the procedures that can be accessed through the
system procedure table. Each entry is made up of an address (or IP) field and a type field. The address
field gives the address of the first instruction of the target procedure. Since all instructions are word
aligned, only the 30 most-significant bits of the address are given. The processor automatically
provides zeros for the two least-significant bits.


The procedure entry type field indicates the type of call to execute: local or supervisor. The encodings 
of this field are shown in Table 6. 


Entry Type Field   Procedure Type

00                 local procedure

01                 reserved

10                 supervisor procedure

11                 reserved
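

The two fields of a procedure entry can be separated with simple masking. The following Python sketch is illustrative only: the function name and sample values are hypothetical, and it assumes that the type field occupies the two least-significant bits of the 32-bit entry, which is the position left free by the 30-bit address field.

```python
ENTRY_TYPES = {0b00: "local", 0b10: "supervisor"}


def decode_procedure_entry(entry):
    """Split a 32-bit procedure entry into a target address and an entry type."""
    address = entry & 0xFFFFFFFC  # 30 most-significant bits; low bits read as zero
    entry_type = ENTRY_TYPES.get(entry & 0b11, "reserved")  # assumed field position
    return address, entry_type


# Hypothetical entries for a procedure located at address 0x8000.
local_entry = decode_procedure_entry(0x00008000)
supervisor_entry = decode_procedure_entry(0x00008002)
```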


When a supervisor call is made, the processor switches to a new stack called the supervisor stack.
The processor gets a pointer to this stack from the supervisor-stack-pointer entry (bytes 12-15, bits
2-31) in the system procedure table. Since stack frames are word aligned, only the 30 most-significant
bits of the supervisor stack pointer are given.


The trace-control 
flag (byte 12, bit 0) specifies the new value of the trace-enable 
flag when a 
supervisor call causes a switch from user mode to supervisor mode. The use of this bit is described 
in Section 9. 


When a calls instruction references a procedure entry designated as a local type (00₂), the processor
executes a local call to the procedure selected from the system procedure table. Neither a mode switch
nor a stack switch occurs.


The ret instruction permits returns from either a local procedure or a supervisor procedure. The return
status field in local register r0 determines the type of return action that the processor is to take. If the
return status field is set to 000₂, a local return is executed. In a local return, no stack or mode switching
is carried out.


The processor provides a mode and stack switching mechanism called the user-supervisor 
protection 
model. This protection model allows a system to be designed in which kernel code and data reside 
in the same address space as user code and data, but access to the kernel procedures (called supervisor 
procedures) 
is only allowed through a tightly controlled interface. This interface is provided by the 
system procedure table. 


The user-supervisor protection model also allows kernel procedures to be executed using a different
stack (the supervisor stack) than is used to execute applications program procedures. The ability to
switch stacks helps maintain the integrity of the kernel. For example, it would allow system
debugging software or a system monitor to be accessed, even if an applications program crashes.


When using the user-supervisor protection model, the processor can be in either of two execution
modes: user or supervisor. The difference between the two modes is that when in the supervisor
mode, the processor

• switches to the supervisor stack, and

• may execute a set of supervisor-only instructions.


Note


In the 80960KB implementation of the 80960 architecture, the only supervisor-only instruction is the
modify process controls (modpc) instruction.


Mode switching between the user and supervisor execution modes is accomplished through a
supervisor call. A supervisor call is a call executed with the calls instruction that references a
supervisor procedure in the system procedure table (i.e., a procedure with an entry type 10₂).


When the processor is in the user mode and it executes a calls instruction, the processor performs the
following actions:

• It switches to supervisor mode.

• It switches to the supervisor stack.

• It sets the return status field in register r0 of the calling procedure to 01X₂, indicating that a mode
  and stack switch has occurred.


The processor remains in the supervisor mode until a return is performed from the procedure 
that 
caused the original mode switch. While in the supervisor mode, either the local call instructions (call 
and callx) or the calls instruction can be used to call supervisor procedures. 


(The call and callx instructions call local (or user) procedures in user mode and supervisor procedures 
in supervisor mode. There is no stack or processor state switching associated with these instructions.) 


When a ret instruction is executed and the return status field is set to 01X₂, the processor performs
a supervisor return. Here, the processor switches from the supervisor stack to the local stack, and the
execution mode is switched from supervisor to user.


When using the user-supervisor 
mechanism, the processor maintains separate stacks in the address 
space, one for procedures executed in the user mode (local procedures) 
and another for procedures 
executed in the supervisor mode (supervisor procedures). When in the user mode, the local procedure 
stack described at the beginning of this section is used. When a supervisor call is made, the processor 
switches to the supervisor stack. It continues to use the supervisor stack until a return is made to the 
user mode. 


The structure of the supervisor stack is identical to that of the local procedure stack (shown in Figure
5). The processor obtains the SP for the supervisor stack from the system procedure table. When a
supervisor call is executed while in the user mode (causing a switch to the supervisor stack), the
processor aligns this SP to the next 64-byte boundary to form the new FP for the supervisor stack.
When a local call or supervisor call is made while in the supervisor mode, the processor aligns the
SP in the current frame of the supervisor stack to the next 64-byte boundary to form the new FP.
This operation allows supervisor procedures to be called from supervisor procedures.


The user-supervisor protection model has three basic uses in an embedded system application:


1. to allow the modpc instruction to be used,

2. to allow kernel code to use a separate stack from the applications code, and

3. to allow an external memory management unit (MMU) to provide protection for kernel code and
   data.


If an application does not require any of the above features, it can be designed to not use the
user-supervisor protection model. Here, all procedure calls are to local procedures. If the system
procedure table is used, all the entries must be the local type (i.e., entry type 00₂).


If access to the modpc instruction is required, but the other two features are not, it is suggested that
the system be designed to always run in supervisor mode. At initialization, the processor automatically
places itself in supervisor mode prior to executing the first instruction. The processor then
remains in supervisor mode indefinitely, as long as no action is taken to change the execution mode
to user mode (i.e., using the modpc instruction to change the execution mode bit of the process
controls to 0). With this technique, all of the procedure calling instructions (call, callx, and calls) can
be used. The processor only uses one stack, which is considered the supervisor stack. It gets the
supervisor stack pointer from local register r2. (Prior to making the first procedure call, the supervisor
stack pointer must be loaded into r2.)


The processor does not support the last use of the user-supervisor 
protection model directly. In other 
words, the processor does not provide a pin or other device that indicates to external hardware when 
a mode switch has occurred. Several techniques are available to perform this operation, which are 
beyond the scope of this discussion. 


The bal (branch and link) and balx (branch and link extended) 
instructions 
provide an alternate 
method of making procedure calls. These instructions save the address of the next instruction (RIP) 
in a specified location, then branch to a target instruction or set of instructions. The state of the local 
registers and stack remains unchanged. 
(For the bal instruction, the RIP is automatically 
stored in 
global register g14; for the balx instruction, 
the location of the RIP is specified with one of the 
instruction operands.) 


A return is accomplished 
with a bx (branch extended) instruction, 
where the address of the target 
instruction is the one saved with the branch and link instruction. 


Branch and link procedure calls are recommended for calls to procedures that (1) do not call other
procedures (i.e., for procedure calls that do not result in nesting of procedures) and (2) do not need
many local variables (i.e., allocation of a new set of local registers does not provide any benefit). Here,
local registers as well as global registers can be used for parameter passing.


This section describes the data types that the 80960KB processor recognizes 
and the addressing 
modes that are available for accessing memory locations. 


The processor defines and operates on the following data types: 


• Integer (8, 16, 32, and 64 bits)

• Ordinal (8, 16, 32, and 64 bits)

• Real (32, 64, and 80 bits)

• Decimal (ASCII digits)

• Bit Field

• Triple-Word (96 bits)

• Quad-Word (128 bits)


Note


The real and decimal data types are not defined in the 80960 architecture. They are supported in the
80960KB processor, but not in the 80960KA processor.


The integer, ordinal, real, and decimal data types can be thought of as numeric data types because 
some operations on these data types produce numeric results (e.g. add, subtract). 


The remaining data types (bit field, triple word, and quad word) represent groupings of bits or bytes
that the processor can operate on as a whole, regardless of the nature of the data contained in the
group. These data types facilitate the moving of blocks of bits or bytes.


Integers are signed whole numbers, which are stored and operated on in two's complement 
format. 


The processor recognizes four sizes of integers: 8 bit (byte integers), 16 bit (short integers), 32 bit
(integers), and 64 bit (long integers). Figure 9 shows the formats for the four integer sizes and the
ranges of values allowed for each size.


DATA TYPE        RANGE                  DECIMAL EQUIVALENT

BYTE INTEGER     -2^7 TO 2^7 - 1        -128 TO 127

SHORT INTEGER    -2^15 TO 2^15 - 1      -32,768 TO 32,767

INTEGER          -2^31 TO 2^31 - 1      -2.14 x 10^9 TO 2.14 x 10^9

LONG INTEGER     -2^63 TO 2^63 - 1      -9.22 x 10^18 TO 9.22 x 10^18


Ordinals are a general-purpose 
data type. The processor recognizes four sizes of ordinals: 8 bit (byte 
ordinals), 
16 bit (short ordinals), 32 bit (ordinals), and 64 bit (long ordinals). 
Figure 10 shows the 
formats for the four ordinal sizes and the ranges of numeric values allowed for each size. 


DATA TYPE        RANGE                  DECIMAL EQUIVALENT

BYTE ORDINAL     0 TO 2^8 - 1           0 TO 255

SHORT ORDINAL    0 TO 2^16 - 1          0 TO 65,535

ORDINAL          0 TO 2^32 - 1          0 TO 4.29 x 10^9

LONG ORDINAL     0 TO 2^64 - 1          0 TO 1.84 x 10^19


The processor uses ordinals for both numeric and non-numeric 
operations. For numeric operations, 
ordinals are treated as unsigned whole numbers. The processor provides several arithmetic instruc- 
tions that operate on ordinals. For non-numeric 
operations, ordinals contain bit fields, byte strings, 
and Boolean values. 


When ordinals are used to represent Boolean values, 1₂ represents TRUE and 0₂ represents
FALSE.
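

The integer and ordinal ranges given above follow directly from the operand width. The following Python sketch is illustrative only (the function names are hypothetical); it computes the ranges for each size and shows how the same bit pattern reads as an ordinal or as a two's complement integer.

```python
def integer_range(bits):
    """Range of a two's-complement integer of the given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def ordinal_range(bits):
    """Range of an unsigned ordinal of the given width."""
    return 0, (1 << bits) - 1


def ordinal_to_integer(raw, bits):
    """Reinterpret an ordinal bit pattern as a two's-complement integer."""
    raw &= (1 << bits) - 1
    return raw - (1 << bits) if raw >> (bits - 1) else raw


for bits in (8, 16, 32, 64):
    print(bits, integer_range(bits), ordinal_range(bits))
```

For example, the byte pattern FF hexadecimal is 255 as a byte ordinal and -1 as a byte integer.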


Reals are floating-point numbers. The processor recognizes three sizes of reals: 32 bit (reals), 64 bit
(long reals), and 80 bit (extended reals). The real-number format conforms to ANSI/IEEE
Std. 754-1985, the IEEE Standard for Binary Floating-Point Arithmetic. Real numbers are discussed in
greater detail in Section 11.


The processor provides three instructions that perform operations on decimal values when the values
are presented in ASCII format. Figure 11 shows the ASCII format for decimal digits. Each decimal
digit is contained in the least-significant byte of an ordinal (32 bits).


The decimal digit must be of the form 0011dddd₂, where dddd₂ is a binary-coded decimal value from
0 to 9. For decimal operations, bits 8 through 31 of the ordinal containing the decimal digit are
ignored.


[Figure 11: ASCII format, with bit positions 31, 7, and 0 marked]
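

The ASCII decimal-digit format can be checked with a few bit operations. The following Python sketch is illustrative only: the function name is hypothetical, and returning None for a malformed digit is a choice made for this example, not a description of how the processor reports invalid operands.

```python
def decimal_digit_value(ordinal):
    """Return the digit 0-9 encoded as 0011dddd in the low byte, else None."""
    low_byte = ordinal & 0xFF       # bits 8 through 31 are ignored
    if low_byte >> 4 != 0b0011:     # upper nibble must be 0011
        return None
    digit = low_byte & 0x0F         # dddd, a binary-coded decimal value
    return digit if digit <= 9 else None


seven = decimal_digit_value(ord("7"))          # ASCII "7" is 0011 0111
ignored_upper = decimal_digit_value(0xABCDEF35)  # upper 24 bits do not matter
```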


The processor provides several instructions that perform operations on individual bits or fields of bits
within an ordinal (32 bit) operand. Figure 12 shows these data types.


An individual bit is specified for a bit operation by giving its number in the ordinal in which it resides.
The least-significant bit of a 32-bit ordinal is bit 0; the most-significant bit is bit 31.


A bit field is a contiguous sequence of from 0 to 32 bits within a 32-bit ordinal. A bit field is
defined by giving its length in bits and the bit number of its lowest-numbered bit.


Triple and quad words refer to consecutive bytes in memory or in registers: a triple word is 12 bytes
and a quad word is 16 bytes. These data types facilitate the moving of blocks of bytes. The triple-word
data type is useful for moving extended-real numbers (80 bits).


The quad-word instructions (ldq, stq, and movq) offer the most efficient way to move large blocks
of data.


The processor provides instructions for moving blocks of data values of various lengths from
memory to registers (load) and from registers to memory (store). The allowable sizes for blocks are
bytes, half-words (2 bytes), words (4 bytes), double words, triple words, and quad words. For
example, the stl (store long) instruction stores an 8-byte (double word) block of data in memory.


When a block of data is stored in memory, the least-significant 
byte of the block is stored at a base 
memory address and the more significant bytes are stored at successively 
higher addresses. 
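

The byte ordering described above (least-significant byte at the base address) can be demonstrated on a simulated memory. The following Python sketch is illustrative only: the function names, the bytearray standing in for memory, and the sample value are hypothetical.

```python
def store_block(memory, base, value, size):
    """Store value as size bytes, least-significant byte at the base address."""
    memory[base:base + size] = value.to_bytes(size, "little")


def load_block(memory, base, size):
    """Load size bytes starting at base, treating the first byte as least significant."""
    return int.from_bytes(memory[base:base + size], "little")


memory = bytearray(16)
store_block(memory, 4, 0x1122334455667788, 8)  # an 8-byte (double word) block
round_trip = load_block(memory, 4, 8)
```

After the store, the byte at address 4 holds 88 hexadecimal and the byte at address 11 holds 11 hexadecimal.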


When loading a byte, half-word, or word from memory to a register, the least-significant 
bit of the 
block is always loaded in bit 0 of the register. When loading double words, triple words, and quad 
words, the least-significant 
word is stored in the base register. The more significant words are then 
stored at successively higher numbered registers. Double words, triple words, and quad words must 
also be aligned in registers to natural boundaries as described in the section "Register Alignment". 


Bits can only be addressed in data that resides in a register. Bit 0 in a register is the least-significant 
bit and bit 31 is the most-significant 
bit. 


The processor offers 11 modes for addressing operands. These modes are grouped as follows: 


Literal 


Register 


Absolute 


Register Indirect 


Register Indirect with Index 


Index with Displacement 


IP with Displacement 


Most of the instructions use only the first two modes (literal and register). The remaining modes are 
used for memory related instructions. 


Table 8 shows all the addressing modes, a brief description 
of the elements of the address in each 
mode, and the assembly-code 
syntax for each mode. 


Mode                     Description              Assembler Syntax

Literal                  value                    value

Register                 register                 reg

Absolute offset          offset                   exp

Register Indirect        abase                    (reg)

Register Indirect        abase + offset           exp (reg)
with offset

Register Indirect        abase + (index*scale)    (reg) [reg*scale]
with index

Register Indirect        abase + (index*scale)    exp (reg) [reg*scale]
with index and           + displacement
displacement

Index with               (index*scale)            exp [reg*scale]
displacement             + displacement

IP with                  IP + displacement + 8    exp (IP)
displacement
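

The address arithmetic in the Description column of Table 8 can be written out directly. The following Python sketch is illustrative only: the function names and sample register values are hypothetical, results are wrapped to 32 bits as an assumption, and no check is made on which scale factors the processor accepts.

```python
ADDRESS_MASK = 0xFFFFFFFF  # assume addresses wrap within the 32-bit address space


def effective_address(abase=0, index=0, scale=1, displacement=0):
    """abase + (index*scale) + displacement; unused terms default to zero."""
    return (abase + index * scale + displacement) & ADDRESS_MASK


def ip_with_displacement(ip, displacement):
    """IP + displacement + 8, as listed for the IP with displacement mode."""
    return (ip + displacement + 8) & ADDRESS_MASK


# Register indirect with index and displacement: exp (reg) [reg*scale]
address = effective_address(abase=0x1000, index=3, scale=4, displacement=8)
# Register indirect: (reg)
plain = effective_address(abase=0x1000)
```

The simpler memory modes in the table are the same formula with one or more terms left at zero.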


The processor recognizes two types of literals: ordinal literal and floating-point literal. An ordinal
literal can range from 0 to 31 (5 bits). When an ordinal literal is used as an operand, the processor
expands it to 32 bits by adding leading zeros. If the instruction defines an operand larger than 32 bits,
the processor zero-extends the value to the operand size. If an ordinal literal is used in an instruction
that requires integer operands, the processor treats the literal as a positive integer value.


The processor 
also recognizes 
two floating-point 
literals (+0.0 and + 1.0). These floating-point 
literals can only be used with floating-point 
instructions. As with the ordinal literals, the processor 


converts the floating-point 
literals to the operand size specified by the instruction. 


A few ofthe floating-point 
instructions use both floating-point 
and non-floating-point 
operands (e.g. 


the convert integer-to-real 
instructions). 
Ordinal literals can be used in these instructions for non- 
floating-point 
operands. 


Note 

Floating-point literals are not defined in the 80960 architecture. 

A register is referenced as an operand by giving the register number (e.g., g0, r5, fp3). Both floating-point and non-floating-point instructions can reference global and local registers in this way. However, floating-point registers can only be referenced in conjunction with a floating-point instruction. 

Absolute addressing is used to reference a memory location directly as an offset from address 0 of the address space, ranging from -2^31 to 2^31 - 1. Typically, an assembler will allow absolute addresses to be specified through arithmetic expressions (e.g., x + 44), symbolic labels, and absolute values. 


At the machine level, two absolute-addressing modes are provided, depending on the instruction format (i.e., MEMA or MEMB). For the MEMA format, the offset is an ordinal number ranging from 0 to 2048; for the MEMB format, the offset is an integer (called a displacement) ranging from -2^31 to 2^31 - 1. After evaluating an absolute address, the assembler will convert the address into an offset and select the appropriate machine-level instruction type and addressing mode. (The machine-level addressing modes and instruction formats are described in Appendix B.) 


The register indirect addressing modes allow an address to be specified with an ordinal value (32 bits) in a register or with an offset or a displacement added to a value in a register. Here, the value in the register is referred to as the address base (abase). 

Again, an assembler will allow the offset and displacement to be specified with an expression or symbolic label, then evaluate the address to determine whether an offset or a displacement is appropriate. 


The register indirect with index addressing modes allow a scaled index to be added to the value in a register. The index is specified by means of a value placed in a register. This index value is then multiplied by the scale factor. The allowable scale factors are 1, 2, 4, 8, and 16. 

A scaled index can also be used with a displacement alone. Again, the index is contained in a register and is multiplied by a scaling constant before the displacement is added to it. 

The IP with displacement addressing mode is often used with load and store instructions to make them IP relative. 


This section provides an overview of the instruction set for the 80960KB processor. Included is a discussion of the instruction format and a summary of the instruction groups and the instructions in each group. 


Section 10 gives detailed descriptions of each of the instructions. The instructions are listed in this section in alphabetical order. Included for each instruction are the assembly-language format, the action taken when the instruction is executed, and examples of how the instruction might be used. 


Appendix C provides a detailed description of the factors that affect instruction timing. It also gives 
the number of clock cycles required for each instruction. 


The instructions are referred to by their assembly-language mnemonics. For example, the add ordinal instruction is referred to as the addo instruction. 

An assembly-language statement consists of an instruction mnemonic, followed by from 0 to 3 operands, separated by commas. The following example shows the assembly-language statement for the addo instruction: 

    addo g5, g9, g7 

Here, the ordinal operands in global registers g5 and g9 are added together and the result is stored in g7. 


A detailed description of the nomenclature used to describe assembly-language instructions is given in Section 10. 

At the machine level of the processor, all instructions are word aligned. Most of the instructions are one word long, although some addressing modes make use of a two-word format. 


There are four instruction formats: register (REG), compare and branch (COBR), control (CTRL), 
and memory (MEM). Each instruction uses one of these formats, which is determined by the opcode 
field of the instruction. 


The 80960KB processor implements all the instructions in the 80960 instruction set, which includes 
all of the data movement, arithmetic, logical, and program control instructions commonly found in 
computer architectures. 
The processor also includes a set of floating-point 
instructions and several 
instructions to handle architectural 
extensions found in the processor. 


Data Movement 


Arithmetic 
(Ordinal and Integer) 


Logical 


Bit and Bit Field 


Comparison 


Branch 


Call/Return 


Fault 


Debug 


Processor Management 


The instruction-set 
extensions 
found in the 80960KB processor 
include the following 
groups of 
instructions: 


Integer to Real Conversion 


Floating Point 


Synchronous 
Move and Load 


Decimal. 


Tables 9 and 10 give a summary of the 80960 instructions and the 80960KB instruction-set extensions, respectively. The actual number of instructions is greater than shown in these tables because, for some operations, several different instructions are provided to handle different operand sizes, data types, or branch conditions. 


Data Movement    Arithmetic          Logical          Bit and Bit Field

Load             Add                 And              Set Bit
Store            Subtract            Not And          Clear Bit
Move             Multiply            And Not          Not Bit
Load Address     Divide              Or               Check Bit
                 Remainder           Exclusive Or     Alter Bit
                 Modulo              Not Or           Scan For Bit
                 Shift               Or Not           Scan Over Bit
                 Extended Multiply   Nor              Extract
                 Extended Divide     Exclusive Nor    Modify
                                     Not
                                     Nand
                                     Rotate


Comparison               Branch                  Call/Return        Fault

Compare                  Unconditional Branch    Call               Conditional Fault
Conditional Compare      Conditional Branch      Call Extended      Synchronize Faults
Compare and Increment    Compare and Branch      Call System
Compare and Decrement                            Return
                                                 Branch and Link


Debug                    Processor                    Miscellaneous

Modify Trace Controls    Modify Arithmetic Controls   Atomic Add
Mark                     Modify Process Controls      Atomic Modify
Force Mark               Flush Local Registers        Scan Byte For Equal
                         Test Condition Code


Conversion                 Floating Point        Synchronous         Decimal

Convert Real to Integer    Move Real             Synchronous Load    Move
Convert Integer to Real    Add                   Synchronous Move    Add With Carry
                           Subtract                                  Subtract With Carry
                           Multiply
                           Divide
                           Remainder
                           Scale
                           Round
                           Square Root
                           Sine
                           Cosine
                           Tangent
                           Arctangent
                           Log
                           Log Binary
                           Log Natural
                           Exponent
                           Classify
                           Copy Real Extended
                           Compare

The following sections give a brief overview of the instructions in each of these groups. The floating-point instructions are described in Section 11. 


The data movement instructions include those instructions that move data from memory to the global 
and local registers; that move data from the global and local registers to memory; and that move data 
among these registers. 


The load instructions (listed below) copy bytes or words from memory to a selected register or group of registers: 

ld      load
ldob    load byte ordinal
ldos    load short ordinal
ldib    load byte integer
ldis    load short integer
ldl     load long
ldt     load triple
ldq     load quad

For the ld, ldob, ldos, ldib, and ldis instructions, a memory address and a register are specified in the instruction and the value at the memory address is copied into the register. Zero and sign extending is performed automatically for byte and short (half-word) operands. 

The ld, ldl, ldt, and ldq instructions copy 4, 8, 12, and 16 bytes from memory into successive registers. 
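The zero and sign extension performed by the byte and short loads can be modeled as follows. This is an illustrative Python sketch only; the `load` helper and the sample memory contents are hypothetical, and it assumes ordinal loads zero-extend while integer loads sign-extend, as the data-type names suggest.

```python
def load(memory: bytes, address: int, size: int, signed: bool) -> int:
    """Return the 32-bit register image for a load of `size` bytes.

    signed=False models the ordinal loads (zero extension);
    signed=True models the integer loads (sign extension).
    """
    raw = int.from_bytes(memory[address:address + size], "little", signed=signed)
    return raw & 0xFFFFFFFF  # register contents as a 32-bit pattern


memory = bytes([0x80, 0xFF, 0x34, 0x12])

ldob_result = load(memory, 0, 1, signed=False)  # byte ordinal: zero-extended
ldib_result = load(memory, 0, 1, signed=True)   # byte integer: sign-extended
ldos_result = load(memory, 0, 2, signed=False)  # short ordinal
ldis_result = load(memory, 0, 2, signed=True)   # short integer
ld_result = load(memory, 0, 4, signed=False)    # full word, no extension needed
```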


Note 

When using the load, store, and move instructions that move 8, 12, or 16 bytes at a time, the rules for register alignment must be followed. Refer to Section 2, "Register Alignment," for a discussion of these rules. 


For each load instruction there is a corresponding store instruction (listed below), which copies bytes or words from a selected register or group of registers to memory: 

st      store
stob    store byte ordinal
stos    store short ordinal
stib    store byte integer
stis    store short integer
stl     store long
stt     store triple
stq     store quad


For the st, stob, stos, stib, and stis instructions, a register and a memory address are specified in the instruction and the value in the register is copied into memory. For the byte and short instructions, the value in the register is automatically reformatted for the shorter memory location. For the stib and stis instructions, this reformatting can lead to overflow if the register value is too large to be represented in the shorter memory location. 

The move instructions, listed below, copy data from a register or group of registers to another register or group of registers. 

mov     move word
movl    move long word
movt    move triple word
movq    move quad word

These move instructions can only be used to move data among the global and local registers. A set of move-real instructions (movr, movrl, and movre) is provided for moving real number values between the global and local registers and the floating-point registers. The move-real instructions are described in Section 11. 


The lda instruction computes an effective address in the address space from an operand presented in one of the addressing modes. A common use of this instruction is to load a constant into a register. 


Table 11 lists all the arithmetic operations for which the 80960KB processor provides instructions 
and the data types that the instructions 
operate on. An "X" in this table indicates that the 80960 
architecture provides an instruction for the specified operation and data types; an "E" indicates that 
an 80960KB instruction-set 
extension provides an instruction for the specified operation and data 
types. An "E*" indicates that the specified operation can be performed on the specified data type 
using 80960KB extended instructions, but that a unique instruction for this operation is not provided. 
For example, a specific instruction is not provided to add two extended-real 
values. However, this 
operation can be carried out with either the add real (addr) or the add long real (addrl) 
instruction. 


With two exceptions, all the processor's arithmetic operations are carried out on operands in registers. The processor does not provide instructions that perform arithmetic operations on operands in memory. 

The two instructions that are exceptions are the atadd (atomic add) and atmod (atomic modify) instructions, which are discussed later in this section. 


A summary of the arithmetic instructions for real (floating-point) 
data types is provided in Section 
11. The following sections describe the arithmetic instructions for ordinal and integer data types. 


Arithmetic Operations    Integer    Ordinal    Real    Long Real    Extended Real

Add                      X          X          E       E            E*
Subtract                 X          X          E       E            E*
Multiply                 X          X          E       E            E*
Divide                   X          X          E       E            E*
Remainder                X          X          E       E            E*
Modulo                   X
Shift Left               X          X
Shift Right              X          X
Shift Right Dividing     X
Scale                                          E       E            E*
Round                                          E       E            E*
Square Root                                    E       E            E*
Sine                                           E       E            E*
Cosine                                         E       E            E*
Tangent                                        E       E            E*
Arctangent                                     E       E            E*
Exponent                                       E       E            E*
Log                                            E       E            E*
Log Binary                                     E       E            E*
Log Epsilon                                    E       E            E*
Classify                                       E       E            E*
Copy Sign                                                           E
Copy Reversed Sign                                                  E


The following instructions perform add, subtract, multiply, or divide operations on integers and ordinals: 

addi    add integer
addo    add ordinal
subi    subtract integer
subo    subtract ordinal
muli    multiply integer
mulo    multiply ordinal
divi    divide integer
divo    divide ordinal

These instructions perform operations on one-word operands in registers and store the results in a register. 


The following four instructions are provided to support extended arithmetic operations (i.e., arithmetic operations on operands greater than one word in length): 

addc    add ordinal with carry
subc    subtract ordinal with carry
emul    extended multiply
ediv    extended divide


The addc and subc instructions add or subtract two words (contained in registers) plus a condition code bit (used as a carry bit). If the result has a carry, the carry bit in the condition code is set. Also, a second condition code bit is set if the operation would have resulted in an integer overflow condition. (The three-bit condition code is contained in the arithmetic controls as described in Section 2.) 

These instructions treat the operands as ordinals; however, the indication of overflow in the condition code facilitates a software implementation of extended-integer arithmetic. 

The emul instruction multiplies two ordinals (each contained in a register), producing a long ordinal result (stored in two registers). The ediv instruction divides a long ordinal by an ordinal, producing an ordinal quotient and an ordinal remainder. 
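To show how an add-with-carry operation chains across words, the following Python sketch adds two multi-word ordinals one 32-bit word at a time. It models only the carry behavior described above; the function names `addc` and `add_multiword` are hypothetical helpers, and the overflow indication is deliberately left out.

```python
MASK32 = 0xFFFFFFFF


def addc(a: int, b: int, carry_in: int):
    """Add two 32-bit words plus a carry bit; return (result word, carry out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def add_multiword(x, y):
    """Add two equal-length word lists, least-significant word first."""
    result, carry = [], 0
    for a, b in zip(x, y):
        word, carry = addc(a, b, carry)  # carry out feeds the next word
        result.append(word)
    return result, carry


# 0x00000000_FFFFFFFF + 1 carries into the upper word.
words, carry_out = add_multiword([0xFFFFFFFF, 0x00000000], [0x00000001, 0x00000000])
```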


remi 
remainder integer 


remo 
remainder ordinal 


modi 
modulo integer 


The difference between the remainder and modulo instructions lies in the sign of the result. For the remi and remo instructions, the result has the same sign as the dividend; for the modi instruction, the result has the same sign as the divisor. 
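The sign rule can be checked with a small Python model. This is a sketch only: the functions borrow the mnemonic names for readability but are ordinary Python, and they assume nonzero divisors.

```python
def remi(dividend: int, divisor: int) -> int:
    """Remainder whose sign follows the dividend (truncating division)."""
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return dividend - quotient * divisor


def modi(dividend: int, divisor: int) -> int:
    """Modulo whose sign follows the divisor.

    Python's % operator already follows the sign of the divisor.
    """
    return dividend % divisor


pairs = [(7, 3), (-7, 3), (7, -3), (-7, -3)]
results = [(remi(a, b), modi(a, b)) for a, b in pairs]
```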


shlo 
shift left ordinal 


shro 
shift right ordinal 


shli 
shift left integer 


shri 
shift right integer 


shrdi 
shift right dividing integer 


These instructions shift the operand a specified number of bits to the left or to the right. The shlo, shli, shro, and shrdi instructions are equivalent to multiplying (shift left) or dividing (shift right) by a power of 2. Bits shifted beyond the register boundary are discarded. 

The shri instruction performs a conventional arithmetic shift right. However, when this instruction is used to divide an integer operand by a power of 2, it produces an incorrect quotient for some negative operands. (The shrdi instruction produces the correct quotient when this divide operation is used on negative operands.) 
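The difference between the two right shifts on negative operands can be seen in the following Python sketch. It assumes that the "correct" quotient is the one produced by integer division that truncates toward zero; the function names reuse the mnemonics only for readability.

```python
def shri(value: int, count: int) -> int:
    """Conventional arithmetic shift right (rounds toward negative infinity)."""
    return value >> count


def shrdi(value: int, count: int) -> int:
    """Shift right dividing: quotient of value / 2**count, truncated toward zero."""
    if value < 0:
        return -((-value) >> count)
    return value >> count


# -7 / 2 truncates to -3, but a plain arithmetic shift gives -4.
plain = shri(-7, 1)
dividing = shrdi(-7, 1)
```

The two functions agree for non-negative operands and for negative operands that are exact multiples of the divisor.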


The rotate instruction 
rotates the bits of the operand to the left (toward higher significance) 
by a 
specified number of bits. Bits shifted beyond the left boundary of the register (bit 31) appear at the 
right boundary (bit 0). 
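A left rotate of a 32-bit value can be modeled as below. This is an illustrative Python sketch; reducing the rotate count modulo 32 is an assumption of the sketch, not something the text states.

```python
MASK32 = 0xFFFFFFFF


def rotate(value: int, count: int) -> int:
    """Rotate a 32-bit value left; bits leaving bit 31 re-enter at bit 0."""
    count %= 32  # assumed: only the low five bits of the count matter
    value &= MASK32
    return ((value << count) | (value >> (32 - count))) & MASK32


rotated_one = rotate(0x80000001, 1)
rotated_byte = rotate(0x12345678, 8)
```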


and       A and B
notand    (not A) and B
andnot    A and (not B)
xor       not (A = B)
or        A or B
nor       (not A) and (not B)
xnor      A = B
not       not A
notor     (not A) or B
ornot     A or (not B)
nand      (not A) or (not B)
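The logical operations can be written out bit-for-bit in Python as a quick check of the definitions. This is a sketch: the table `LOGICAL_OPS` and the helper `apply_logical` are hypothetical names, and the 32-bit mask stands in for the register width.

```python
MASK32 = 0xFFFFFFFF

# Each entry maps a mnemonic to its bitwise definition on operands A and B.
LOGICAL_OPS = {
    "and":    lambda a, b: a & b,
    "notand": lambda a, b: ~a & b,
    "andnot": lambda a, b: a & ~b,
    "xor":    lambda a, b: a ^ b,       # not (A = B)
    "or":     lambda a, b: a | b,
    "nor":    lambda a, b: ~a & ~b,
    "xnor":   lambda a, b: ~(a ^ b),    # A = B
    "not":    lambda a, b: ~a,          # single operand; b is ignored
    "notor":  lambda a, b: ~a | b,
    "ornot":  lambda a, b: a | ~b,
    "nand":   lambda a, b: ~a | ~b,
}


def apply_logical(op: str, a: int, b: int = 0) -> int:
    """Apply a logical operation and truncate the result to 32 bits."""
    return LOGICAL_OPS[op](a, b) & MASK32


notand_result = apply_logical("notand", 0b1100, 0b1010)
nand_result = apply_logical("nand", 0b1100, 0b1010)
```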


The processor provides several types of instructions 
that are used to compare two operands. The 
following sections describe the compare instructions for ordinal and integer data types. The compare 
instructions 
for real data types are discussed in Section 11. 


The compare instructions listed below compare two operands, then set the condition-code bits in the arithmetic controls according to the results. 

cmpi       compare integer
cmpo       compare ordinal
concmpi    conditional compare integer
concmpo    conditional compare ordinal

The condition-code bits are set to indicate whether one operand is less than, equal to, or greater than the other operand. (Refer to Section 2, "Functions of the Arithmetic Controls Bits," for a discussion of the meanings of the condition-code bits for conditional operations.) 

The cmpi and cmpo instructions simply compare the two operands and set the condition-code bits accordingly. 

The concmpi and concmpo instructions first check the status of bit 2 of the condition code. If it is not set, the operands are compared as with the cmpi and cmpo instructions. If bit 2 is set, no comparison is performed and the condition-code bits are not changed. 

The conditional compare instructions are provided specifically to optimize two-sided range comparisons to check if A is between B and C (i.e., B ≤ A ≤ C). Here, a compare instruction (cmpi or cmpo) is used to check one side of the range (e.g., A ≥ B) and a conditional compare instruction (concmpi or concmpo) is used to check the other side (e.g., A ≤ C) according to the result of the first comparison. 


The following instructions compare two operands, set the condition-code bits according to the results, then increment or decrement one of the operands: 

cmpinci    compare and increment integer
cmpinco    compare and increment ordinal
cmpdeci    compare and decrement integer
cmpdeco    compare and decrement ordinal


The branch instructions allow the direction of program flow to be changed by explicitly modifying 
the IP. The processor provides three types of branch instructions: 


unconditional 
branch 


conditional 
branch 


compare and branch 


Most of the branch instructions specify the target IP by specifying a signed displacement to be added to the current IP. Other branch instructions specify the memory address of the target IP using one of the processor's addressing modes. This latter group of instructions is called the extended-addressing instructions (e.g., branch extended, branch and link extended). 


b       Branch
bx      Branch Extended
bal     Branch and Link
balx    Branch and Link Extended

The b and bx instructions cause program execution to jump to the specified target IP. As described in Section 10, these two instructions perform the same function; however, they use different machine-level instruction formats. 

The bal and balx instructions store the address of the next instruction in a specified register, then jump to the specified target IP. (For the bal instruction, the RIP is automatically stored in register g14; for the balx instruction, the location of the RIP is specified with an instruction operand.) As described in Section 3, the branch and link instructions provide a method of performing procedure calls that does not use the processor's call/return mechanism. Here, the saved instruction address is used as a return IP. 


The bx and balx instructions can be made IP-relative by using the IP with displacement 
addressing 
mode. 


With the conditional branch (branch if) instructions, the processor checks the condition-code bits in the arithmetic controls. If these bits match the value specified with the instruction, the processor jumps to the target IP. These instructions use the displacement plus IP method of specifying the target IP: 

be     branch if equal
bne    branch if not equal
bl     branch if less
ble    branch if less or equal
bg     branch if greater
bge    branch if greater or equal
bo     branch if ordered
bno    branch if unordered


(Refer to Section 2, "Functions of the Arithmetic Controls Bits" for a discussion of meanings of the 
condition-code 
bits for conditional 
operations.) 


The bo and bno instructions refer to comparisons of real numbers. Ordered and unordered real numbers are described in Section 11. 


The compare and branch instructions compare two operands, then branch according to the results. 
There are three subtypes of instructions in this group: compare integer, compare ordinal and check 
bit: 


cmpibe     compare integer and branch if equal
cmpibne    compare integer and branch if not equal
cmpibl     compare integer and branch if less
cmpible    compare integer and branch if less or equal
cmpibg     compare integer and branch if greater
cmpibge    compare integer and branch if greater or equal
cmpibo     compare integer and branch if ordered
cmpibno    compare integer and branch if unordered
cmpobe     compare ordinal and branch if equal
cmpobne    compare ordinal and branch if not equal
cmpobl     compare ordinal and branch if less
cmpoble    compare ordinal and branch if less or equal
cmpobg     compare ordinal and branch if greater
cmpobge    compare ordinal and branch if greater or equal
bbs        check bit and branch if set
bbc        check bit and branch if clear


With the compare-ordinal-and-branch and compare-integer-and-branch instructions, two operands are compared and the condition-code bits are set, as with the compare instructions described earlier in this section. A conditional branch is then executed as with the conditional branch (branch if) instructions. 

With the check-bit-and-branch instructions, one operand specifies a bit to be checked in the other operand. The condition-code bits are set according to the state of the specified bit (i.e., 010₂ if the bit is set and 000₂ if the bit is clear). A conditional branch is then executed according to the setting of the condition-code bits. 


setbit      set bit
clrbit      clear bit
notbit      not bit
chkbit      check bit
alterbit    alter bit
scanbit     scan for bit
spanbit     span over bit

The setbit, clrbit, and notbit instructions set, clear, or complement (toggle) a specified bit in an ordinal. 

The chkbit instruction causes the condition-code bits to be set according to the state of a specified bit in a register. The condition code is set to 010₂ if the bit is set and 000₂ otherwise. 

The alterbit instruction alters the state of a specified bit in an ordinal according to the condition code. If the condition code is 010₂, the bit is set; if the condition code is 000₂, the bit is cleared. 

The scanbit and spanbit instructions find the most significant set bit and clear bit, respectively, in an ordinal. 


There are two bit field instructions: extract and modify. The extract instruction converts a specified bit field, taken from an ordinal value, into an ordinal value. In essence, this instruction shifts a bit field in a register to the right and fills in the bits to the left of the bit field with zeros. 


The modify instruction copies bits from one register, under control of a mask, into another register. 
Only the unmasked bits in the destination register are modified. 
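The two bit field operations can be sketched in Python as follows. The parameter names (`bit_position`, `length`) and their order are hypothetical, and the sketch assumes that 1 bits in the mask select the destination bits that receive bits from the source; consult the instruction descriptions in Section 10 for the actual operand definitions.

```python
MASK32 = 0xFFFFFFFF


def extract(value: int, bit_position: int, length: int) -> int:
    """Shift the field down to bit 0 and zero everything to its left."""
    return (value >> bit_position) & ((1 << length) - 1)


def modify(mask: int, source: int, destination: int) -> int:
    """Copy source bits into destination where the mask has 1 bits (assumed)."""
    return ((source & mask) | (destination & ~mask)) & MASK32


field = extract(0xABCD1234, 8, 8)                       # bits 8..15
merged = modify(0x0000FF00, 0x11223344, 0xAABBCCDD)     # replace bits 8..15
```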


The scan byte instruction performs a byte-by-byte 
comparison 
of two ordinals to determine if any 
two corresponding 
bytes are equal. The condition 
code is set according 
to the results of the 
comparison. 


Data can be converted from one length to another by means of the load and store instructions. 
For 
example, 
the ldis instruction 
loads a short integer from memory to a register and automatically 
converts the integer from a half word to a full word. 


The 80960KB extended instruction set provides instructions to perform conversions between integer and real data types. These instructions are described in Section 11. 


The processor offers an on-chip call/return mechanism for making procedure calls to local procedures and kernel procedures. This call/return mechanism is described in detail in Section 3. The following four instructions are provided to support this mechanism. 

call     call
callx    call extended
calls    call system
ret      return

The call and callx instructions call local procedures. The call instruction specifies the target procedure (the first instruction of the procedure) by adding a signed displacement to the IP. The callx instruction uses extended addressing, as described for the bx and balx instructions, to specify the target procedure. For both of these instructions, a new set of local registers and a new stack frame are allocated for the called procedure. 

The calls instruction operates similarly to the call and callx instructions, except that it gets its target procedure address from the system procedure table. An index number, included as an operand in the instruction, provides an entry point into the procedure table. 


Depending on the type of entry being pointed to in the procedure table, the calls instruction can cause a supervisor call to be executed. A supervisor call causes the processor to switch to the supervisor stack and to switch to supervisor mode. The supervisor call is described in detail in Section 3. 


The ret instruction performs a return from a called procedure to the calling procedure (the procedure 
that made the call). This instruction obtains its target IP (return IP) from linkage information that was 
saved for the calling procedure. The ret instruction is used to return from local and supervisor calls 
and from implicit calls to interrupt and fault handlers. 


The atomic instructions perform read-modify-write operations on operands in memory. They ensure that an operation on a specified memory location is completed before another agent with access to memory is allowed to access that memory location. These instructions are particularly useful in systems in which several agents have access to system memory. 

There are two atomic instructions: atomic add (atadd) and atomic modify (atmod). The atadd instruction causes an operand to be added to the value in the specified memory location. The atmod instruction causes bits in the specified memory location to be modified under control of a mask. 
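The read-modify-write guarantee can be illustrated with a Python model in which a lock stands in for the exclusion the processor provides. Everything here is an assumption made for illustration: the `SharedMemory` class, the method signatures, the interpretation of the mask (1 bits select the bits to replace), and the choice to return the previous value are conveniences of the sketch, not a description of the instructions' operands.

```python
import threading

MASK32 = 0xFFFFFFFF


class SharedMemory:
    """Word-addressed memory shared by several agents (threads here)."""

    def __init__(self):
        self._words = {}
        self._lock = threading.Lock()  # models exclusive access to a location

    def read(self, address: int) -> int:
        with self._lock:
            return self._words.get(address, 0)

    def atadd(self, address: int, operand: int) -> int:
        """Add operand to the word at address as one indivisible step."""
        with self._lock:
            old = self._words.get(address, 0)
            self._words[address] = (old + operand) & MASK32
            return old

    def atmod(self, address: int, mask: int, source: int) -> int:
        """Replace the masked bits of the word at address indivisibly."""
        with self._lock:
            old = self._words.get(address, 0)
            self._words[address] = ((source & mask) | (old & ~mask)) & MASK32
            return old


def worker(memory: SharedMemory, address: int, repeats: int) -> None:
    for _ in range(repeats):
        memory.atadd(address, 1)


memory = SharedMemory()
agents = [threading.Thread(target=worker, args=(memory, 0x100, 1000)) for _ in range(4)]
for agent in agents:
    agent.start()
for agent in agents:
    agent.join()
counter = memory.read(0x100)  # no increments are lost
```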


Generally, the processor generates faults automatically as the result of certain operations. Fault handling routines are then invoked to handle the various types of faults without explicit intervention by the currently running process. (Faults are discussed in detail in Section 8.) 


The following conditional fault instructions permit a fault to be generated explicitly according to the 
state of the condition-code 
bits: 


faulte     fault if equal
faultne    fault if not equal
faultl     fault if less
faultle    fault if less or equal
faultg     fault if greater
faultge    fault if greater or equal
faulto     fault if ordered
faultno    fault if unordered


The processor 
supports debugging 
and monitoring 
of program activity through the use of trace 
events. The following instructions 
support these debugging and monitoring 
tools: 


modtc    modify trace controls
mark     mark
fmark    force mark


The trace functions are controlled through the processor's 
trace controls bits. Some of these bits allow 
various types of tracing to be enabled or disabled. Other bits act as flags to indicate when an enabled 
trace event has been detected. (Trace controls are described in detail in Section 9.) 


The mark instruction causes a breakpoint trace event to be generated if the breakpoint trace mode 
is enabled. The fmark instruction 
generates 
a breakpoint 
trace independent 
of the state of the 
breakpoint trace mode flag. The latter two instructions allow a breakpoint to be placed anywhere in 
a program. 


The modpc instruction 
provides a method of reading and modifying 
the contents of the process 
controls. 


In certain instances, it is necessary to ensure that the contents of the local-register save area of the stack frames are the same as the local registers. The flush local registers instruction (flushreg) automatically stores the contents of all the local register sets, except the current set, in the register save area of their associated stack frames. 


The arithmetic controls cannot be addressed with the load, move, and store instructions 
or the bit 
instructions. 
Instead, special instructions are provided for this purpose. 


The modify arithmetic controls instruction (modac) permits bits in the arithmetic controls register to be modified under the control of a mask. 


teste     test if equal
testne    test if not equal
testl     test if less
testle    test if less or equal
testg     test if greater
testge    test if greater or equal
testo     test if ordered
testno    test if unordered

These instructions cause a TRUE (010₂) to be stored in a destination register if the condition code matches the condition specified with the instruction. Otherwise, a FALSE (000₂) is stored in the register. 


The following non-floating-point instructions are extensions to the 80960 architecture instruction set. The synchronous load and move instructions are provided in both the 80960KB and 80960KA processors; the decimal instructions are provided only in the 80960KB processor. 


The processor's 
store instructions 
are executed asynchronously 
with the memory controller. Once 
the processor sends data out its bus for storage in main memory, it continues with the next instruction 
in the instruction 
stream, assuming that its bus control logic will carry out the operation. 


The 80960KB processor provides four special instructions that perform store and move operations synchronously with memory. 

The synchronous load instruction (synld) loads a word from memory into a register. When this instruction is performed, the processor waits until a condition code bit is set in the arithmetic controls, indicating that the operation has been completed, before it begins executing the next instruction. 

The synchronous move instructions (synmov, synmovl, and synmovq) perform synchronous moves of data from one location in memory to another. 


dmovt    move and test decimal
daddc    decimal add with carry
dsubc    decimal subtract with carry


These instructions operate on 32-bit decimal operands that contain an 8-bit, ASCII-coded decimal digit in the least-significant byte of the word (as shown in Figure 11). 

The dmovt instruction moves a decimal operand from one register to another and tests the least-significant byte of the operand to determine if it is a decimal digit (0 to 9). It sets the condition code according to the results of the test: 010₂ if the operand contains a decimal digit and 000₂ otherwise. 

The daddc and dsubc instructions operate similarly to the addc and subc instructions. They add or subtract two decimal digits plus bit 1 of the condition code (used as a carry-in bit). If the operation produces a decimal carry, the condition code is set accordingly. The subtraction operation is carried out in 10's complement arithmetic. 
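The digit-at-a-time decimal addition described above can be modeled in Python. This is a sketch under stated assumptions: the function `daddc` here is a plain Python stand-in that takes the carry as an explicit argument rather than reading a condition-code bit, and `add_decimal_strings` is a hypothetical helper showing how such a step could be chained over ASCII digit strings.

```python
def daddc(a: int, b: int, carry_in: int):
    """Add two ASCII-coded decimal digits plus a carry.

    Returns (ASCII-coded result digit, carry out).
    """
    digit_sum = (a & 0x0F) + (b & 0x0F) + carry_in
    carry_out = 1 if digit_sum > 9 else 0
    if carry_out:
        digit_sum -= 10
    return 0x30 | digit_sum, carry_out  # 0x30 is ASCII "0"


def add_decimal_strings(x: str, y: str) -> str:
    """Add two non-negative decimal strings one digit at a time."""
    width = max(len(x), len(y))
    x, y = x.rjust(width, "0"), y.rjust(width, "0")
    digits, carry = [], 0
    for cx, cy in zip(reversed(x), reversed(y)):  # least-significant digit first
        digit, carry = daddc(ord(cx), ord(cy), carry)
        digits.append(chr(digit))
    if carry:
        digits.append("1")
    return "".join(reversed(digits))


total = add_decimal_strings("999", "1")
```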


With the 80960KB processor, the most efficient method of multiplying or dividing decimal numbers 
is to convert them into extended-real numbers and use the mulr and divr instructions. Decimal values 
of up to 18 decimal digits can be handled with this technique. 


This section describes the facilities for initializing 
and managing 
the operation 
of the 80960KB 
processor. Included is a description of the processor-management 
facilities and the steps required to 
initialize the processor. Appendix D gives a listing of the necessary 80960KB code to initialize the 
processor. 


This section and sections 7, 8, 9, and 12 describe the 80960KB's processor-management facilities.


These facilities are primarily software-related, although some hardware considerations are also
discussed.


For the purpose of discussion in these sections, it is assumed that the processor is going to execute a
program made up of a system kernel (or executive) and applications code. This program may be
located in ROM or RAM.


Such a program has the following facilities available to it to initialize, communicate with, and control
the processor:


Instruction List 


System Data Structures 


Interrupts 


IACs 


Faults 


These facilities allow system hardware and the kernel to initialize the processor and initiate
instruction execution. They also provide software or external agents with methods of interrupting
the processor to service external I/O devices.


At the most rudimentary level, the processor is controlled through a stream of instructions that the
processor fetches from memory and executes one at a time. Once the processor is initialized, it begins
executing instructions and continues until it is stopped.


The processor defines several system data structures that reside in memory. These data structures
(shown in Figure 13) offer a means of configuring the processor to operate in a specific way.


[Figure 13. Diagram labels: INITIAL MEMORY IMAGE (IMI); STACK POINTER LOCATED IN LOCAL REGISTER r1]


The system data structures can be located anywhere in the processor's address space. The processor
gets pointers to most of these data structures from the initial memory image (IMI). The IMI is
described later in this section in "Initial Memory Image".


The interrupt table provides pointers to interrupt-handling procedures. The interrupt vector numbers
act as indices into this table. For the purpose of handling interrupts, a separate interrupt stack is
maintained in the address space. The interrupt mechanism is described in Section 7.


The fault table provides pointers to fault-handling procedures. When the processor detects a fault,
it generates a fault vector number internally that provides an index into the fault table. The fault
mechanism is described in Section 8.


The system procedure table contains pointers to the kernel procedures, which are accessed using the
system call (calls) mechanism. The system table structure is described in Section 3, "System
Procedure Table".


The processor uses two stacks for procedure calls: the local procedure stack and the (optional)
supervisor stack. These stacks are described in Section 3.


The processor also contains a register, called the process controls register, that it uses to store
information about the current state of the processor and the program it is executing. The process
controls are described later in this section under "Process Controls".


The processor defines two methods of asynchronously requesting services from the processor:
interrupts and IAC messages. Interrupts are the more common of the two.


An interrupt is a break in the control flow of a program so that the processor can handle a more urgent
chore. Interrupt requests are generally sent to the processor from an external source, often to request
I/O services. When the processor receives an interrupt request, it temporarily stops work on its
current task and begins work on an interrupt-handling procedure. Upon completion of the
interrupt-handling procedure, the processor generally returns to the task that was interrupted and
continues work where it left off.


Interrupts also have a priority, which the processor uses to determine whether to service the interrupt
immediately or to postpone service until a later time.


The 80960KB processor provides an alternate method of communicating with the processor: other agents on
the system bus are able to communicate with the processor through IAC messages that are exchanged in a
reserved section of memory. Like an interrupt, an IAC message causes the processor to stop work on its
current task and begin work on another task. However, where an interrupt generally causes a temporary
break in the execution of a program, an IAC often causes a permanent change in the control flow of the
processor.


While executing instructions, the processor is able to recognize certain conditions that could cause
it to return an inappropriate result or that could cause it to go down a wrong and possibly disastrous
path. One example of such a condition is a divisor operand of zero in a divide operation. Another
example is an instruction with an invalid opcode. These conditions are called faults.


The processor handles faults almost the same way that it handles interrupts. When the processor
detects a fault, it automatically stops its current processing activity and begins work on a
fault-handling procedure.


The process-controls word (shown in Figure 14) contains miscellaneous pieces of information to
control processor activity and show the current state of the processor. The various functions of this
field are described in the following paragraphs.


[Figure 14. Process-controls word. Field labels: TRACE ENABLE; EXECUTION MODE; RESUME; TRACE-FAULT PENDING; STATE; PRIORITY; INTERNAL STATE. Reserved bits are initialized to 0.]


The execution mode flag determines whether the processor is operating in the user mode (clear) or
supervisor mode (set). The processor automatically sets this bit on a supervisor call and clears it on
a return from supervisor mode.


The priority field determines the priority (from 0 to 31) of the processor. When the processor is in 
the executing state, it sets its priority according to this value. 


State Field    Processor State
0              Executing
1              Interrupted


This bit tells software whether the processor is currently executing a program (0) or has been
interrupted so it can service an interrupt (1).


The trace-enable and trace-fault-pending flags control tracing. The trace-enable field determines
whether trace faults are to be generated (set) or not generated (clear). The trace-fault-pending field
is a flag that the processor uses to determine if a trace event has been detected (set) or not (clear). The
use of these fields is discussed in detail in Section 9.


The resume flag signals the processor that an instruction has been suspended. The processor sets this
flag whenever it suspends an instruction to handle an interrupt or fault. On a return from the interrupt
or fault handler, the processor checks this flag and performs an instruction resumption action if the
flag is set.


All of the bits in the process controls are set to zero as part of the initialization procedure. Bits 2
through 8, 11, 13, 15, and 21 through 31 are reserved. These bits should not be altered following
initialization.


Modify-process-controls instruction (modpc)


Alter the saved process controls prior to a return from an interrupt handler 


Alter the saved process controls prior to a return from a fault handler 


In the latter two methods, the kernel changes the process controls in the interrupt or fault record that
is saved on the stack. On the return from the interrupt or fault handler, the modified process controls
are copied into the processor's internal process controls.


Note 


Changing the saved process controls by means of a fault handler can only be used if the fault handler
was invoked by means of an implicit supervisor call.


When the process controls are changed as described above, the processor acts on the changes as soon
as it receives the new information, except for the following situation.


If the modpc instruction is used to change the trace-enable flag, the processor is not guaranteed to
act on the change until up to four more instructions have been executed.


The processor defines a priority mechanism for determining the order in which programs, interrupts, 
and IACs are worked on. Priorities range from 0 to 31, with 31 being the highest priority. Each 
interrupt vector is assigned a priority. Also, when the processor is executing a program, it sets its 
priority according to the priority field of the process controls. 


Interrupt priorities serve two functions. First, they determine if the processor will service an interrupt
immediately or delay servicing it with respect to its current priority. Second, they determine which
interrupt of several interrupts is serviced first.


When the processor receives an IAC, it always services it immediately (i.e., treats the IAC as if it has
a priority of 31). A mechanism is provided that allows priorities to be assigned to IACs. When using
this mechanism, external hardware is required to intercept all IACs sent to the processor and to check
their priority. This hardware then determines whether to send the IAC to the processor for servicing
or delay it according to the current priority of the processor.


The processor has four different operating states: executing, interrupted, stopped, and
stopped-interrupted. The processor is placed in one of two states (executing or stopped) at
initialization. After that, the processor and software control the processor's state.


The processor can switch between the executing and interrupted states or between the stopped and
stopped-interrupted states. However, the processor never switches from the executing state to the
stopped state, unless it detects a series of fault conditions that it cannot handle.


Software can change the state of the processor in either of two ways: (1) issue a reinitialize IAC or
(2) issue a freeze IAC. The reinitialize IAC forces the processor to reread the pointers from the IMI
and begin executing instructions from a new IP. The freeze IAC forces the processor into the stopped
state.


If the processor is interrupted while in the executing state, it saves the current state of the program,
switches to the interrupted state, and services the interrupt. Upon returning from the interrupt handler,
the processor resumes work on the program.


In the stopped state the processor ceases all activity. The only tasks it can perform while in this state
are to service an interrupt or an IAC. While servicing an interrupt, the processor switches to the
stopped-interrupted state. It then switches back to the stopped state upon completion of the interrupt
routine. Likewise, while servicing an IAC, the processor switches to the stopped-interrupted state.
If the IAC handling action does not result in a change in the processor's state, the processor switches
back to the stopped state when it finishes the IAC handling action.


The only way to get the processor out of the stopped state (other than to service an interrupt) is to 
reinitialize the processor, either with a hardware reset or by sending it an external reinitialize IAC. 
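The state changes described above can be summarized as a small transition table. The following Python sketch is illustrative only: the event names ("interrupt", "iac", "done", "freeze_iac") are hypothetical labels chosen for this example, and reinitialization and unrecoverable faults are deliberately left out.

```python
# Transition table keyed by (current state, event).
# Event names are assumptions of this sketch, not architecture terms.
TRANSITIONS = {
    ("executing", "interrupt"): "interrupted",
    ("interrupted", "done"): "executing",
    ("executing", "freeze_iac"): "stopped",
    ("stopped", "interrupt"): "stopped-interrupted",
    ("stopped", "iac"): "stopped-interrupted",
    ("stopped-interrupted", "done"): "stopped",
}


def next_state(state, event):
    """Return the state that follows `event`, or raise if the pair is not modeled."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no modeled transition from {state!r} on {event!r}") from None
```

Keeping the transitions in a dictionary makes it easy to see that there is no direct path from stopped back to executing in this model, which matches the requirement to reinitialize the processor.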


When the processor is interrupted while it is in the midst of executing an instruction, it does one of
three things before it services the interrupt:


1. It completes the instruction.


2. It terminates the instruction and sets the processor state so that it is as if execution of that
   instruction had not yet begun.


3. It suspends the instruction and saves the necessary resumption information so that execution of
   the instruction can be continued when the processor begins work on the program again. This
   course of action is generally reserved for instructions that have a long execution time and that
   alter the internal and external processor state as they execute.


Which of these steps the processor takes depends on the instruction being executed. However,
whichever step it takes is transparent to the software. The processor automatically saves the
necessary state information so that work on the program can be resumed with no loss of information.


Refer to Section 7, "Interrupt Handling Action", for more information on how resumption
information is saved when an interrupt is serviced.


The processor provides a 2^32-byte address space. This address space can be mapped to read-write
memory, read-only memory, and memory-mapped I/O. (The processor does not provide a dedicated,
addressable I/O space.)


The address space is linear (or flat): there are no subdivisions of the address space such as segments.
For the purpose of memory management, an external memory management unit (MMU) may
subdivide memory into pages or restrict access to certain areas of memory to protect kernel code and
data. But from the point of view of the processor, the address space is linear.


All of the address space is available for general use except the upper 16M bytes (FF000000₁₆ to
FFFFFFFF₁₆), which are reserved for special functions. (These functions are described in Section 12.)


An address in memory is a 32-bit value in the range 0 to FFFFFFFF₁₆. It can be used to reference a
single byte, 2 bytes, 4 bytes, 8 bytes, 12 bytes or 16 bytes of memory depending on the instruction
being used. (Refer to the descriptions of the load and store instructions in Section 10 for information
on multiple-byte addressing.)


The processor requires that the memory to which the address space is mapped has the following 
capabilities. 


It must be byte addressable. 


It must support burst transfers (i.e., transfers of blocks of contiguous bytes up to 16 bytes in
length).


It must guarantee indivisible access (read or write) for memory addresses that fall within 16-byte 
boundaries. 


It must guarantee atomic access for memory addresses that fall within 16-byte boundaries. 


The latter two capabilities are required to allow multiple processors to share a common memory
conveniently.


An indivisible access guarantees that a processor reading or writing a set of memory locations will 
complete the operation before another processor can read or write the same location. The processor 
requires indivisible access within an aligned, 16-byte block of memory. 
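Whether a given access stays inside one aligned, 16-byte block can be checked with simple integer arithmetic. The sketch below is a Python illustration under that reading of the requirement; the function name within_aligned_block is hypothetical.

```python
BLOCK_SIZE = 16  # size in bytes of the aligned block named in the text


def within_aligned_block(address, length):
    """Return True if bytes [address, address + length) lie in one aligned 16-byte block."""
    if length <= 0:
        raise ValueError("length must be positive")
    first_block = address // BLOCK_SIZE
    last_block = (address + length - 1) // BLOCK_SIZE
    return first_block == last_block
```

For example, an 8-byte access at address 1008₁₆ stays within one block, while the same access at 100C₁₆ crosses into the next block.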


An atomic access is a read-modify-write operation. Here external logic must guarantee that once a
processor begins a read-modify-write operation on a set of memory locations, it is allowed to
complete the operation before another processor is allowed to access the same location.


As described above, the processor requires that when one processor is performing an atomic
operation within an aligned, 16-byte block, other processors are delayed from performing another
atomic operation within that block until the first operation has been completed.


The 80960KB processor provides two features to aid in implementing the memory requirements
described above: SIZE lines and a LOCK line on the local bus.


The SIZE lines indicate the length of a memory access in bytes. These lines can be used to specify
1-, 2-, 4-, 8-, 12-, or 16-byte lengths. When making a multiple-byte access, the processor thus sends
the memory controller a base address on the address lines and a length on the SIZE lines.


The LOCK line is used to synchronize atomic operations. When a processor performs an atomic
operation, it first examines the LOCK line. If it is asserted, the processor waits until the line is not
asserted (i.e., spins on the LOCK line). If the line is not asserted, the processor asserts the LOCK line
when it is performing an atomic read and deasserts the line when it performs the companion atomic
write.


The processor-management facilities described earlier in this section allow the processor to be
configured and operated in several ways. This section lists the data structures that the kernel must
supply to operate the processor.


Other System Data Structures 


Address Space 


Stacks 


Code 


As part of the initialization procedure, a more complete set of system data structures is established
in memory. These data structures include an interrupt table and a fault table. If the system call
mechanism is going to be used, a system procedure table is required.


Two stacks are also required: an interrupt stack and a local (or user) procedure stack. The initial stack
pointer for the interrupt stack is given in the IMI. The initial stack pointer (SP) for the local-procedure
stack is given in local register r1; the initialization code is required to establish the SP value in this
register.


If the supervisor call mechanism is to be used, a supervisor stack must also be provided. The initial
stack pointer for this stack is given in the system-procedure table. The supervisor stack can be placed
anywhere in the address space.


Note


"Hints on Using the User-Supervisor Protection Model" in Section 3 describes an application of the
user-supervisor protection model, in which the processor is always in supervisor mode. When using this
application, the local stack and the supervisor stack are the same. The processor gets the initial stack
pointer for this stack from register r1.


Finally, three levels of code are required: initialization code, kernel code, and applications code. The
initialization code is part of the IMI. (Appendix D gives an initialization code example.) The starting
IP for the initialization code is also provided in the IMI.


This section describes how to initialize the 80960KB processor. It defines the mechanism that the
processor uses to establish its initial state and begin instruction execution. It also describes some
general guidelines for writing code to complete the initialization of the processor for specific
applications.


Note 


The 80960 architecture does not define an initial memory image or an initialization procedure. The
following initialization requirements are specific to the 80960KB processor.


The IMI performs three functions for the processor: (1) it provides check-sum words that the
processor uses in its self-test routine at start-up, (2) it provides pointers to the system data structures,
and (3) it provides scratch space that the processor uses to perform certain internal functions. Figure
15 shows the structure of the IMI.


The IMI is made up of four parts: the check-sum words, the system address table (SAT), the
processor control block (PRCB), and the initialization code. In an embedded application, all of the
parts of this image will generally be held in ROM, except the scratch space of the PRCB. For this
reason, the PRCB should be copied from ROM to RAM after system initialization. (The reinitialize
IAC, described in Section 12, is used to give the processor the PRCB pointer for the relocated PRCB.)


The check-sum words must be in memory locations 00000000₁₆ to 0000001F₁₆. The first of these
words is a pointer to the base of the SAT. The second word is a pointer to the base of the PRCB. The
fourth word is the instruction pointer to the first instruction of the initialization code.


The remaining words (word 3 and words 5 through 8) are check words, which must be chosen such
that the one's complement of the sum of the eight words plus FFFFFFFF₁₆ equals 0.


The SAT is 158 bytes in size and can be located anywhere in the address space. It has four required
entries. The word beginning at byte 136 must contain a pointer to the base (first byte) of the SAT. This
pointer is identical to the pointer given in the first word of the check-sum words. The word beginning
at byte 152 must contain a pointer to the base of the system procedure table. The words beginning
at byte 140 and 156 must contain 00FC00FB₁₆ and 304400FB₁₆, respectively.


The PRCB is 174 bytes long and can also be located anywhere in the address space. It has seven
required entries and one reserved space.


The write-external-priority flag (bit 31 of the word beginning at byte 4) instructs the processor to
write the priority of the processor to the IAC message control field whenever an interrupt (not caused
by an IAC) or the execution of the modpc instruction occurs. When this bit is set, the
write-external-priority mechanism is enabled; when the bit is clear, the mechanism is disabled. The use
of this flag is described in Section 12.


The interrupt table pointer points to the first byte of the interrupt table. The interrupt stack pointer 
points to the top (first available byte) of the interrupt stack. 


The processor uses the scratch space in the IMI for internal functions. This field should be set to all
zeros at initialization or reinitialization of the processor and not accessed by software thereafter.


The remaining fields in the PRCB (bytes 8 through 19, bytes 28 through 31, and bytes 48 through 
79) are reserved. They should be set to all zeros at initialization or restart and not accessed by software 
thereafter. 


The initial instruction list that the processor begins executing following its self test can be located 
anywhere in the address space. 


At initialization or on a reinitialize processor IAC, the processor reads the pointers from the IMI in
memory and caches them.


In general, to change any of the IMI fields that have been cached on the processor chip, the kernel 
must first modify the IMI in memory, then reinitialize the processor using the reinitialize processor 
IAC. The processor then rereads the IMI and reloads the cached fields in its internal cache. 


The IMI shown in Figure 15 contains the minimum data structures required for the processor to
initialize itself and begin executing code. To build a useful system, however, additional data
structures are required, such as an interrupt table, a fault table, a system procedure table, a set of
kernel procedures, a set of stacks, and a heap. Some of these data structures can be located in ROM
along with the IMI; however, others must be in RAM because they must be writable.


Table 13 lists the various system data structures and shows which can be in ROM and which must 
be in RAM. The following paragraphs give the system limitations if a data structure is included in 
ROM. 


Data Structure 
May Be in ROM 
May Be in ROM 
Must Be in RAM 
with Limitations 


IMI 
X 


PRCB 
X 


SAT 
X 


Interrupt table 
X 


Fault table 
X 


Kernel Procedures 
X 


Stacks and heap 
X 


The interrupt table must be in RAM for the processor to operate properly, because it contains the 
interrupt pending fields, which the processor must be able to write to. 


The fault table can be in ROM, provided it will never be necessary to relocate the fault handler
routines.


Initialization of the 80960KB processor typically is handled in two stages. In the first stage of
initialization the processor performs a self test and reads pointers from the IMI. During the second
stage, the processor executes initialization code designed to build the remainder of the memory image
so that execution of applications code can begin.


The following procedure shows the steps that system hardware and the processor go through in the 
first stage of initialization. 
The algorithm in Figure 16 gives the details of this procedure. 


assert FAILURE pin;
perform self test;
if self test fails
  then enter stopped state;
else
  deassert FAILURE pin;
  enter predefined state;
  if STARTUP pin = 0
    then enter stopped state;
  else
    x ← memory(0);    # read 8 words beginning at address 0
    AC.cc ← 000₂;
    temp ← FFFFFFFF₁₆ add_with_carry x(0);
    temp ← temp add_with_carry x(1);
    temp ← temp add_with_carry x(2);
    temp ← temp add_with_carry x(3);
    temp ← temp add_with_carry x(4);
    temp ← temp add_with_carry x(5);
    temp ← temp add_with_carry x(6);
    temp ← temp add_with_carry x(7);
    if temp ≠ 0
      then
        assert FAILURE pin;
        enter stopped state;
    else
      prcb_address ← memory(4);
      IP ← memory(12);
      fetch IMI;
      processor.priority ← 31;
      processor.state ← interrupted;
      FP ← IMI.interrupt_stack_pointer;
      clear any latched external interrupt/IAC signals;
      begin execution;
    endif;
  endif;
endif;


1. Hardware asserts the RESET pin on the processor.


2. The processor samples LPN to get its local processor number (1 or 0). (LPN and STARTUP are signals
   that come from multiplexed information received on several processor pins.)


3. The processor asserts the FAILURE pin and performs a self test. If the processor passes the self
   test, it deasserts the FAILURE pin.


4. The processor samples STARTUP to determine whether it is the initializing processor (1) or not
   (0). If the processor is the initializing processor, it continues with the initialization procedure;
   if it is not, it goes into the stopped state. (In multiprocessing systems, all processors except the
   initializing processor are put in the stopped state.)


5. The processor reads the 8 check-sum words and checks that the check sum is 0.


6. Using the contents of the check-sum words, the processor determines the location of the SAT,
   the PRCB, and the first instruction to be executed.


7. The processor sets its process priority to 31 (highest possible) and its state to interrupted.


8. The processor clears any latched external interrupt or IAC signals. This means that the processor
   will not service any interrupts or IACs prior to beginning instruction execution.


9. The processor begins execution of the initialization instruction list.


After self test, the processor establishes its own state. For the initializing processor this state is
interrupted; for any other processors in the system this state is stopped. Also at initialization, the trace
controls are set to zero; the process controls are set to zero (except for the execution mode, which is
set to supervisor, and the priority, which is set to 31); and the breakpoint registers are disabled.


Since the processor places itself in the interrupted state during the first stage of initialization, the
initialization code is essentially a special interrupt-handler procedure.


The processor activity during the second stage of initialization, which occurs once the processor
begins instruction execution, is up to software. In general, this stage of initialization is used to copy
or create additional data structures in memory, such as the interrupt table, the system-procedure table,
the fault table (if not in the initial memory image), and the kernel procedures.


Appendix D gives an example of the 80960KB code that might be used to carry out this second stage 
of initialization. 


A common initialization technique is to create a new PRCB and interrupt table in RAM along with
the other system data structures that are placed in memory in the second stage of initialization. The
processor is then reinitialized to point to the PRCB and interrupt table. (The code in Appendix D uses
this technique.)


The processor is reinitialized using the reinitialize IAC. This reinitialize IAC message includes new
pointers to the SAT and PRCB. The processor reads the new PRCB, then begins instruction execution
according to the control information contained in the PRCB.


This section describes the 80960KB processor's interrupt handling facilities. It also describes how
interrupts are signaled.


An interrupt is a temporary break in the control stream of a program so that the processor can handle 
another chore. Interrupts are generally requested from an external source. The interrupt request either 
contains a vector number or else points to a vector that tells the processor what chore to do while in 
the interrupted state. When the processor has finished servicing the interrupt, it generally returns to 
the program that it was working on when the interrupt occurred and resumes execution where it left 
off. 


The processor provides a mechanism for servicing interrupts, which uses an implicit procedure call 
to a selected interrupt handling procedure, called an interrupt handler. 


When an interrupt occurs, the current state of the program is saved. If the interrupt occurs during an 
instruction that requires many machine cycles, the instruction state is also saved and execution of the 
instruction is suspended. 


The processor then creates a new frame on the interrupt stack and executes an implicit call to the 
interrupt handler selected with the interrupt vector. 


Upon returning from the interrupt handler, the processor switches back to the program that was
running when the interrupt occurred, restores it to the state it was in when the interrupt occurred, and
resumes work on it.


Another feature of this interrupt handling mechanism is that it allows interrupts to be prioritized. If
an interrupt is signaled that has the same or a lower priority than the processor's current priority, the
processor will save the interrupt vector and service the interrupt at a later time. Interrupts that are
waiting to be serviced are called pending interrupts.


To use the processor's interrupt handling facilities, software must provide the following items in
memory:


Interrupt Table 


Interrupt Handler Routines 


Interrupt Stack 


These items are generally established in memory as part of the initialization procedure. Once these
items are present in memory and pointers to them have been entered in the appropriate system data
structures, the processor then handles interrupts automatically and independently of software.


Each interrupt vector is 8 bits in length, which allows up to 256 unique vectors to be defined. In 
practice, vectors 0 through 7 cannot be used, and vectors 244 through 251 are reserved and should 
not be used by software. 


Thus, at each priority level, there are 8 possible vectors (vectors 8 through 15 have a priority of 1,
vectors 16 through 23 a priority of 2, and so on to vectors 248 through 255, which have a priority of
31).
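Because each priority level covers 8 consecutive vectors, the priority of a vector is its number divided by 8, discarding the remainder. The following Python sketch shows that mapping together with the usable-vector ranges given above; the function names are hypothetical.

```python
def vector_priority(vector):
    """Return the priority (0 to 31) implied by an 8-bit interrupt vector."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must fit in 8 bits")
    return vector // 8


def is_usable_vector(vector):
    """True if software may use this vector, per the ranges stated in the text.

    Vectors 0 through 7 cannot be used, and 244 through 251 are reserved.
    """
    return 8 <= vector <= 255 and not 244 <= vector <= 251
```

For example, vector 23 maps to priority 2 and vector 248 maps to priority 31.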


The processor uses the priority of an interrupt to determine whether to service the interrupt
immediately or to delay service. If the interrupt priority is greater than the processor's current
priority, the processor services the interrupt immediately; if the interrupt priority is equal to or less
than the processor's current priority, the processor saves the interrupt vector as a pending interrupt
so that it can be serviced at a later time.


Note that the lowest program priority allowed is 0. If the current program has a 0 priority, a
priority-0 interrupt will never be accepted. This is why vectors 0 through 7 cannot be used. In fact,
there are no entries provided for these vectors in the interrupt table.


The interrupt table contains instruction pointers (addresses in the address space) to interrupt handlers. 
It must be aligned on a word boundary. The processor determines the location of the interrupt table 
by means of a pointer in the IMI. 


As shown in Figure 17, the interrupt table contains one entry (i.e., one pointer) for each allowable
vector. The structure of an interrupt-table entry is given at the bottom of Figure 17. Each interrupt
procedure must begin on a word boundary, so the two least-significant bits of the entry are set to 0.


The first 36 bytes of the interrupt table are used to record pending interrupts. This section of the table
is divided into two fields: pending priorities (byte-offset 0 through 3) and pending interrupts
(byte-offset 4 through 35).


Figure 17. Interrupt Table


The pending priorities field contains a 32-bit string in which each bit represents an interrupt priority.
The bit number in the string represents the priority number. When the processor posts a pending
interrupt in the interrupt table, the bit corresponding to the interrupt's priority is set. For example,
if an interrupt with a priority of 10 is posted in the interrupt table, bit 10 is set.


The pending interrupt field contains a 256-bit string in which each bit represents an interrupt vector.
For example, byte-offset 4 is reserved, byte-offset 5 is for vectors 8 through 15, byte-offset 6 is for
vectors 16 through 23, and so on. When a pending interrupt is logged, its corresponding bit in the
pending interrupt field is set.


This encoding of the pending priority and pending interrupt fields permits the processor to first check
whether there are any pending interrupts with a priority greater than that of the current program and
then to determine the vector number of the interrupt with the highest priority. Software should set
these fields to 0 at initialization and not access them after that.
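

As an illustration of the encoding, the following C sketch models the 36-byte pending-interrupt record
and the two bits that are set when an interrupt is posted. It is a software model only; the processor
performs the real update itself. The priority rule (vector divided by 8) and the bit order within each
byte (bit number equals vector modulo 8) are assumptions, and the type and function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t pending_priorities;      /* byte offsets 0-3, one bit per priority */
    uint8_t  pending_interrupts[32];  /* byte offsets 4-35, one bit per vector  */
} pending_record;

/* Both fields are set to 0 at initialization. */
void pending_init(pending_record *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof *r);
}

/* Sets the priority bit and the vector bit for one pending interrupt. */
void pending_post(pending_record *r, unsigned vector)
{
    assert(r != NULL && vector >= 8 && vector <= 255);
    unsigned priority = vector / 8;                        /* assumed rule      */
    r->pending_priorities |= UINT32_C(1) << priority;
    r->pending_interrupts[vector / 8] |= (uint8_t)(1u << (vector % 8));
}
```

Posting vector 80, for example, sets bit 10 of the pending priorities field and one bit in the byte at
byte-offset 14 of the table (index 10 of the pending interrupts field).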


Note


Refer to the section "Handling Pending Interrupts", later in this section, for a description of the
processor's pending interrupt mechanism.


An interrupt handler is a procedure that performs a specific action that has been associated with a 
particular interrupt vector. For example, a typical job for an interrupt handler is to read a character 
from a keyboard. 


The interrupt handler procedure can be located anywhere in the address space. Each procedure must 
begin on a word boundary. 


When an interrupt-handler procedure is called, the states of the process controls and arithmetic
controls for the interrupted program are saved. However, the interrupt handler shares the other
resources of the interrupted program, in particular the global registers and the address space. This
sharing of resources imposes one important restriction on interrupt handler procedures.


An interrupt handler procedure must preserve and restore the state of any resource that it uses. The
processor allocates a set of local registers to the interrupt handler, just as it does on an explicit
procedure call. If the interrupt handler needs to use the global or floating-point registers, however,
it should save their contents before using them and restore them before returning from the interrupt
handler.


The interrupt stack can be located anywhere in the address space. The processor determines the
location of the interrupt stack by means of a pointer in the IMI.


The interrupt stack has the same structure as the local procedure stack described in Section 3,
"Procedure Stack".


When the processor receives an interrupt, it handles it automatically. The processor takes care of
saving the processor state, calling the interrupt-handler routine, and restoring the processor state
once the interrupt has been serviced. Software support is not required.


The following section describes the actions the processor takes while handling interrupts. It is not
necessary to read this section to use the interrupt mechanism or write an interrupt handler routine.
This discussion is provided for those readers who wish to know the details of the interrupt handling
mechanism.


1.  It temporarily stops work on its current task, whether it is working on a program or another
    interrupt procedure.

2.  It reads the interrupt vector.

3.  It compares the priority of the vector with the processor's current priority.

4.  If the interrupt priority is higher than that of the processor, the processor services the
    interrupt immediately as described in the next sections.

5.  If the interrupt priority is equal to or less than that of the processor, the processor sets the
    appropriate priority bit and vector bit in the pending interrupt record and continues work on its
    current task.


The method that the processor uses to service an interrupt depends on the state the processor is in
when it receives the interrupt. The following sections describe the interrupt handling actions for
various states of the processor. In all of these cases, it is assumed that the interrupt priority is
higher than that of the processor and that the interrupt will thus be serviced immediately after the
processor receives it. The handling of lower priority interrupts is described later in "Pending
Interrupts".


When the processor receives an interrupt while it is in the executing state (i.e., executing a
program), it performs the following actions to service the interrupt. This procedure is the same
regardless of whether the processor is in the user or the supervisor mode when the interrupt occurs:


1.  The processor saves the current state of the process controls and arithmetic controls in an
    interrupt record on the stack that the processor is currently using. This stack can be the
    local-procedure stack or the supervisor stack. (The interrupt record is described in the following
    section.)

2.  If the execution of an instruction was suspended, the processor includes a resumption record for
    the instruction in the current stack and sets the resume flag in the saved process controls. (Refer
    to Section 7, "Instruction Suspension", for a discussion of the criteria for suspending
    instructions.)

3.  The processor switches to the interrupted state.

4.  The processor sets the state flag in the process controls to interrupted, its execution mode to
    supervisor, and its priority to the priority of the interrupt. Setting the processor priority to
    that of the interrupt ensures that lower priority interrupts cannot interrupt the servicing of the
    current interrupt.

5.  Also in its internal process controls, the processor clears the trace-fault-pending and
    trace-enable flags. Clearing these flags allows the interrupt to be handled without trace faults
    being raised.

6.  The processor allocates a new frame on the interrupt stack and switches to the interrupt stack.

7.  The processor sets the frame return status field (associated with the PFP) to 111₂.

8.  The processor performs an implicit call-extended operation (similar to that performed for the
    callx instruction). The address of the procedure that is called is the one specified in the
    interrupt table for the specified interrupt vector.


Once the processor has completed the interrupt procedure, it performs the following actions on the
return:


1.  The processor deallocates the stack frame from the interrupt stack and switches to the local or
    supervisor stack (whichever one it was using when it was interrupted).

2.  The processor copies the arithmetic controls field from the interrupt record into its arithmetic
    controls register.

3.  The processor copies the process controls field from the interrupt record into its internal process
    controls.

4.  If the resume flag of the process controls is set, the processor copies the resumption record from
    the interrupt record to the resumption record field of the PRCB.

5.  The processor checks the interrupt table for pending interrupts whose priority is higher than that
    of the program being returned to. If a higher-priority pending interrupt is found, it is handled as
    if the interrupt occurred at this point.

6.  Assuming that there are no pending interrupts to be serviced, the processor switches to the
    executing state and resumes work on the program.
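

The check in step 5 can be expressed as a single test on the pending priorities field. The sketch
below is an illustration only; it assumes the field has been read into a 32-bit value in which bit n
represents priority n, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if any pending priority is strictly higher than the
 * priority restored from the interrupt record. */
bool higher_priority_pending(uint32_t pending_priorities,
                             unsigned restored_priority)
{
    assert(restored_priority <= 31);
    if (restored_priority == 31)
        return false;                    /* nothing can exceed priority 31 */
    /* Discard bits 0..restored_priority; anything left is higher. */
    return (pending_priorities >> (restored_priority + 1)) != 0;
}
```

A pending interrupt at the same priority as the program being returned to does not qualify, which
matches the rule that only higher-priority interrupts are serviced.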


If the processor receives an interrupt while it is servicing an interrupt, and the new interrupt has a
higher priority than the interrupt currently being serviced, the current interrupt-handler routine is
interrupted. The processor then performs the same actions to save the state of the interrupted
interrupt-handler routine as are described at the beginning of this section. In this case, the
interrupt record is saved on the top of the interrupt stack, prior to the new frame that is created for
use in servicing the new interrupt.


The processor saves the state of an interrupted program (or interrupt-handler routine) in an interrupt
record. Figure 18 shows the structure of this interrupt record.


The resumption record within the interrupt record is used to save the state of a suspended instruction.
If no instruction is suspended, the resumption record is not created.


The processor can also be interrupted while in the stopped state. The processor handles such
interrupts in essentially the same way that it handles interrupts that occur while the processor is in
the executing state, with the following exception. When the processor allocates the new frame on the
interrupt stack, it sets the frame return field to 110₂. This causes the processor to revert to the
stopped state when it returns from the interrupt-handler procedure.


If the processor receives an interrupt while it is in the stopped-interrupted state, it handles the
interrupt just as it would if it occurred in the interrupted state.


As was described earlier in this section, the processor provides a mechanism for evaluating interrupts
according to their priority. If the interrupt priority is equal to or lower than the processor's current
priority, the processor does not service the interrupt immediately. Instead, it posts the interrupt in the
pending interrupt section of the interrupt table. The processor checks the interrupt table at specific
times and services those interrupts that have a higher priority than its current priority. This pending
interrupt mechanism provides two benefits:


1.  The ability to delay the servicing of low priority interrupts (by posting them in the pending
    interrupt section of the interrupt table) allows the processor to concentrate its processing
    activity on higher priority tasks.

2.  In a system that uses two or more 80960KB processors, the processors can share the same
    interrupt table. This interrupt-table sharing allows the processors to share the interrupt handling
    load.


Note


The 80960 architecture defines the section of the interrupt table for storing pending interrupts and a
mechanism for checking the interrupt table for pending interrupts. The method used for posting
interrupts to the interrupt table and the circumstances under which the processor checks the interrupt
table for pending interrupts are not defined.


In the following description of the pending interrupt mechanism, the information given in the sections
titled "Posting Pending Interrupts" and "Checking for Pending Interrupts" is specific to the 80960KB
processor. The information given in the section titled "Handling Pending Interrupts" is defined in the
80960 architecture and should be common to all processors that implement this part of the architecture.


An interrupt can be posted in the pending-interrupt record of the interrupt table in either of the
following two ways:


1.  The processor receives an interrupt with a priority equal to or lower than that of the program the
    processor is currently working on. The processor then automatically posts the interrupt in the
    pending-interrupt record.

2.  The kernel can set the desired pending-interrupt and pending-priority bits in the interrupt table.


Using the first method, the processor performs an atomic read/write operation that locks the interrupt
table until the posting operation has been completed. Locking the interrupt table prevents other
agents on the bus from accessing the interrupt table during this time.


The second method of posting an interrupt is risky, because it does not use this locking technique.
(The processor's atomic instructions are not able to perform a locking operation that spans several
instructions.) This method will work only if the kernel can ensure the following:


that no external I/O agent will attempt to post a pending interrupt simultaneously with the
processor, and


that an interrupt cannot occur after one bit (e.g., the pending priority bit) of the pending-interrupt
record is set but before the other bit (the pending interrupt vector bit) is set.


The processor checks the interrupt table for pending interrupts at the following times:


After returning from an interrupt-handler procedure.


While executing a modify-process-controls instruction (modpc), if the instruction causes the
program's priority to be lowered.


After receiving a test pending interrupts IAC message.


The processor uses the same type of atomic read/write operation to check the interrupt table for
pending interrupts as it does for posting pending interrupts. Again, this technique prevents other
agents on the bus from accessing the interrupt table until the pending-interrupt check has been
completed.


When the processor finds a pending interrupt, it handles it as if it had just received the interrupt. The
handling mechanism is the same as is described earlier in this chapter for interrupts that are serviced
as soon as they are received.


If the processor finds two pending interrupts at the same priority, it services the interrupt with the
highest vector number first.
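

The selection rule can be sketched as a search over the two pending fields. The following C function
is a software illustration, not the processor's actual algorithm. It assumes that priority equals
vector number divided by 8, so that byte p of the pending interrupts field holds the vectors of
priority p, and that bit b of a byte represents vector 8p + b. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the vector to service next, or -1 if no pending interrupt
 * has a priority higher than current_priority. */
int next_pending_vector(uint32_t pending_priorities,
                        const uint8_t pending_interrupts[32],
                        unsigned current_priority)
{
    assert(pending_interrupts != NULL && current_priority <= 31);
    for (int p = 31; p > (int)current_priority; --p) {
        if (!(pending_priorities & (UINT32_C(1) << p)))
            continue;                          /* nothing posted at priority p */
        uint8_t bits = pending_interrupts[p];  /* vectors 8p .. 8p+7 */
        for (int b = 7; b >= 0; --b) {         /* highest vector first */
            if (bits & (1u << b))
                return p * 8 + b;
        }
    }
    return -1;
}
```

Searching priorities from 31 downward and bits from 7 downward gives the highest priority first and,
within one priority, the highest vector number first.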


Note


The 80960 architecture does not define a mechanism for signaling interrupts to the processor. The
methods of signaling interrupts described in the following section are specific to the 80960KB
processor.


The processor can receive an interrupt from any of the following sources:


A signal on its interrupt pins


A signal on its interrupt pins from an external interrupt controller


An IAC message from an external source


An IAC message from a program in the processor


A pending interrupt (described earlier in this chapter)


The processor has four interrupt pins, called INT0, INT1, INT2, and INT3. These pins can be
configured in any of the following three ways:


as four interrupt-signal inputs;


as two interrupt inputs and two pins for handshaking with an interrupt controller such as the Intel
8259A Programmable Interrupt Controller; or


as one IAC input and three interrupt inputs.


A 32-bit interrupt-control register in the processor determines how these pins are used. Each
interrupt pin is associated with one 8-bit field in the register, as shown in Figure 19.


If the interrupt pins are to be used as four inputs, a different interrupt vector is stored in each of the
four fields in the interrupt-control register. Then, when an interrupt is signaled on one of the pins, the
processor reads the vector from the pin's associated field in the register. For example, if an interrupt
is signaled on pin INT0, the processor reads the vector from bits 0 through 7.


The processor assumes that the interrupt vectors in the interrupt-control register are arranged in
descending order from the INT0 field to the INT3 field (i.e., the priority of INT0 ≥ INT1 ≥ INT2 ≥ INT3).
To ensure that interrupts are handled in the proper order, software should follow this convention.


If the INT0 vector field is set to 0, the function of the INT0 pin is changed to IAC, and it is used to
signal the processor that an external IAC message has been sent to it. In fact, the INT0 pin must be
configured in this manner for the processor to service external IAC messages.


If the INT2 vector field is set to 0, the functions of the INT2 and INT3 pins are changed to INTR and
INTA, respectively. Here, the INTR pin is used to receive signals from an interrupt controller and the
INTA pin is used to send acknowledge signals back to the controller. When the processor receives
a signal on the INTR pin, it reads an interrupt vector from the least significant 8 bits of its bus, then
sends an acknowledge signal to the controller through INTA. When the INT2 and INT3 pins are
configured in this manner, the processor ignores the INT3 vector field.
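

The pin configuration rules can be illustrated by decoding a 32-bit register value in C. This is a
sketch under stated assumptions: only the position of the INT0 field (bits 0 through 7) is given in
the text, so the placement of the INT1, INT2, and INT3 fields in bits 8-15, 16-23, and 24-31 is an
assumption to be checked against Figure 19. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t vec[4];                  /* vec[n] is the INTn vector field        */
    int     int0_is_iac;             /* INT0 field is 0: pin acts as IAC       */
    int     int2_int3_are_intr_inta; /* INT2 field is 0: pins act as INTR/INTA */
} int_ctrl_config;

int_ctrl_config decode_int_ctrl(uint32_t reg)
{
    int_ctrl_config c;
    for (int n = 0; n < 4; ++n)
        c.vec[n] = (uint8_t)(reg >> (8 * n));   /* assumed field order */
    c.int0_is_iac = (c.vec[0] == 0);
    c.int2_int3_are_intr_inta = (c.vec[2] == 0);
    return c;
}

/* Checks the descending-order convention. Meaningful only when all four
 * pins are configured as plain interrupt inputs. */
int vectors_descend(const int_ctrl_config *c)
{
    assert(c != NULL);
    return c->vec[0] >= c->vec[1] &&
           c->vec[1] >= c->vec[2] &&
           c->vec[2] >= c->vec[3];
}
```

For example, the value 0x10203040 would decode to vectors 0x40, 0x30, 0x20, and 0x10 for INT0
through INT3, which satisfies the descending-order convention.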


Note


Refer to the 80960KB Hardware Designer's Reference Manual for more information on the use of the INT2
and INT3 pins with an interrupt controller.


The interrupt-control register is memory mapped to addresses FF000004₁₆ through FF000007₁₆.
Only the processor can read or write this register, using the synchronous load (synld) and synchronous
move (synmov) instructions. External agents on the bus cannot access this register.


The processor can also receive an interrupt request by means of the IAC mechanism. (The IAC
mechanism is described in detail in Chapter 13.) The interrupt IAC message can be sent to the
processor either from an external bus agent, such as an I/O processor or another 80960KB processor,
or internally as part of the currently running program. The interrupt vector is contained in the interrupt
IAC message.


As with any other IAC message, the processor receives notice of an external interrupt-IAC message
through the INT0 pin, which has been configured as an IAC pin, as described in the previous section.
The processor then reads the IAC message to get the interrupt vector.


A program running on the processor can signal an interrupt through an internal interrupt-IAC
message. An internal IAC is sent to the processor by means of a synchronous move instruction. When
the processor executes a synchronous move to its IAC message space, it signals an IAC message
internally. The processor then reads the IAC message as it would for an external IAC.


This section describes the fault handling facilities of the 80960KB processor. The subjects covered
include the fault-handling data structures, the software support required for fault handling, and the
fault handling mechanism. A reference section that contains detailed information on each fault type
is provided at the end of the section.


The processor is able to detect various conditions in code or in its internal state (called "fault
conditions") that could cause the processor to deliver incorrect or inappropriate results or that could
cause it to head down an undesirable control path. For example, the processor recognizes
divide-by-zero and overflow conditions on integer calculations. It also detects inappropriate operand
values, uncompleted memory accesses, and references to incomplete or non-existent system-data
structures.


The processor can detect a fault while it is executing a program, an interrupt handler, or a fault handler. 
(In this section, when a program is referred to, it generally also means any interrupt handler or fault 
handler that may have been invoked while the processor was working on the program.) 


When the processor detects a fault, it handles the fault immediately and independently of the program
or handler it is currently working on, using a mechanism similar to that used to service interrupts.


A fault is generally handled with a fault-handling procedure (called a fault handler), which the
processor invokes through an implicit procedure call. Prior to making the call, the processor saves
the state of the current program and in some cases the state of an incomplete instruction. It also saves
information about the fault, which the fault handler can use to correct or recover from the condition
that caused the fault.


If the fault handler is able to recover from the fault, the processor can then restore the program to its 
state prior to the fault and resume work on the program. If the fault handler is not able to recover from 
the fault, it can take any of several actions to gracefully shut down the processor. 


All of the faults that the processor detects are predefined. These faults are divided into types and
subtypes, each of which is given a number. The processor uses the type number to select a fault
handler. The fault handler then uses the subtype number to select a specific fault-handling procedure.


Table 14 lists the faults that the processor detects, arranged by type and subtype. For convenience,
individual faults are referred to in this chapter by their fault-subtype name. Thus a machine
bad-access fault is referred to as simply a bad-access fault, and an arithmetic integer overflow fault
is referred to as an integer overflow fault.


The fifth column of Table 14 shows each fault as it appears in the fault record (the word at offset 40 
of the fault record is shown later in this section). 


Fault Type             Fault Subtype                                    Fault Record
No.  Name              No./Bit Position  Name

1    Trace             Bit 1             Instruction Trace              0xXX010002
                       Bit 2             Branch Trace                   0xXX010004
                       Bit 3             Call Trace                     0xXX010008
                       Bit 4             Return Trace                   0xXX010010
                       Bit 5             Prereturn Trace                0xXX010020
                       Bit 6             Supervisor Trace               0xXX010040
                       Bit 7             Breakpoint Trace               0xXX010080

2    Operation         1                 Invalid Opcode                 0xXX020001
                       2                 Unimplemented                  0xXX020002
                       4                 Invalid Operand                0xXX020004

3    Arithmetic        1                 Integer Overflow               0xXX030001
                       2                 Arithmetic Zero-Divide         0xXX030002

4    Floating Point    Bit 0             Floating Overflow              0xXX040001
                       Bit 1             Floating Underflow             0xXX040002
                       Bit 2             Floating Invalid-Operation     0xXX040004
                       Bit 3             Floating Zero-Divide           0xXX040008
                       Bit 4             Floating Inexact               0xXX040010
                       Bit 5             Floating Reserved-Encoding     0xXX040020

5    Constraint        1                 Constraint Range               0xXX050001
                       2                 Privileged                     0xXX050002

7    Protection        Bit 1             Length                         0xXX070001

8    Machine           1                 Bad Access                     0xXX080001

9    Structural        3                 IAC                            0xXX090003

Note


The 80960 architecture defines a basic set of fault types and subtypes. Processors that provide
extensions to the architecture may recognize additional fault conditions. The encoding of fault types
and subtypes allows any of these extensions to be included in the fault table along with the basic
faults. Space in the fault table will be reserved in such a way that processors that recognize the same
fault types and subtypes will encode them in the same way.


For example, the floating-point faults (fault type 4) are an extension provided in the 80960KB
processor (but not in the 80960KA processor). Any other processors based on the 80960 architecture
that also recognize floating-point faults will also encode them as fault type 4.


The processor handles all faults through an implicit procedure call to a fault handler. When a fault
occurs while the processor is executing a program, the processor creates a fault record on its current
stack. This record includes information on the state of the program and data on the fault. If the fault
occurred while the processor was in the midst of executing an instruction, a resumption record for
the instruction may also be saved on the stack.


Following the creation of the fault and resumption records, the processor selects a fault handler from
a system-data structure called the fault table. It then invokes the fault handler (by means of an
implicit call) and begins executing the handler procedure. As is described later in this section, the
fault handler call can be a local call (call-extended operation), a local system-procedure-table call
(local system-call operation), or a supervisor call.


This same procedure call method is used to handle faults that occur while the processor is servicing 
an interrupt or that occur while the processor is working on a fault handler. 


It is possible for multiple fault conditions to occur simultaneously. For certain fault types, such as
trace faults or protection faults, bit positions in the fault-subtype field are used to indicate the
occurrence of multiple faults of the same type. As a general rule, however, the processor does not
indicate situations where multiple faults occur. Instead, it records one of the faults and does not report
on the faults that were not recorded.


If a fault occurs while the processor is executing a fault-handling routine, the operation of the
processor is not predictable.


If an interrupt occurs during an instruction that will fault, that has just faulted, or that has faulted while
the processor is in the midst of selecting the fault handler, the processor will handle the fault in either
of the following ways:


It includes the fault information as part of its interrupt record and services the interrupt
immediately. After it has serviced the interrupt, it handles the fault.


It completes the selection of the fault handler, then services the interrupt just prior to executing
the first instruction of the fault handler.


To use the processor's fault-handling facilities, the following system-data structures and procedures
must be present in memory:


Fault Table


Fault-Handler Procedures


Interrupt Table


Interrupt Stack

Software should generally load these items in memory as part of the initialization procedure. Once
they are present in memory and pointers to them have been included in the IMI, the processor
handles faults automatically and independently from software.


The fault table provides the processor with a pathway to the fault handlers when the processor is using
the implicit procedure-call method of handling faults. As shown in Figure 20, there is one entry in the
fault table for each fault type. When a fault occurs, the processor uses the fault type to select an entry
in the fault table. From this entry, the processor then obtains a pointer to the fault handler for the type
of fault that occurred.


The fault handler reads the fault subtype or subtypes from the fault record to determine the
appropriate fault recovery action.


The fault table can be located anywhere in the address space. The processor obtains a pointer to the
fault table from the IMI.


Each entry in the fault table is two words long. As shown in Figure 20, there are two types of
fault-table entries allowed: local-procedure entry and system-procedure-table entry. The entry-type
field determines the entry type.


A local-procedure entry (entry type 00₂) provides an instruction pointer (address in the address space)
for the fault handler procedure. Using this entry, the processor invokes the specified fault handler by
means of an implicit call-extended operation (similar to that performed for the callx instruction). The
second word of a local-procedure entry is reserved. It should be set to zero when the fault table is
created and not accessed after that.


A system-procedure-table entry (entry type 10₂) provides a procedure number in the system
procedure table. Using this entry, the processor invokes the specified fault handler by means of an
implicit call-system operation (similar to that performed for the calls instruction).
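

The two entry types can be told apart with a short decoding routine. The following C sketch rests on
assumptions that the text does not state and that Figure 20 would have to confirm: that the entry for
fault type n starts at byte offset 8n in the fault table (one two-word entry per type, starting at
offset 0) and that the entry-type field is held in the two low-order bits of the first word. All names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    ENTRY_INVALID      = -1,
    ENTRY_LOCAL        = 0x0,  /* entry type 00 (binary): local procedure        */
    ENTRY_SYSTEM_TABLE = 0x2   /* entry type 10 (binary): system procedure table */
} entry_kind;

/* Byte offset of the entry for a fault type (assumed: 8 bytes per entry). */
size_t fault_entry_offset(unsigned fault_type)
{
    assert(fault_type <= 0xFFu);
    return 8u * (size_t)fault_type;
}

/* Classifies an entry from the assumed entry-type field in its first word. */
entry_kind fault_entry_kind(uint32_t first_word)
{
    switch (first_word & 0x3u) {
    case 0x0: return ENTRY_LOCAL;
    case 0x2: return ENTRY_SYSTEM_TABLE;
    default:  return ENTRY_INVALID;
    }
}
```

Under these assumptions a word-aligned handler address naturally has entry type 00₂, which would
explain why a local-procedure entry can hold an instruction pointer directly.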




Fault-handling procedures in the system procedure table can be local procedures or supervisor
procedures. A fault handler can thus be invoked through the fault table in any of three ways: implicit
local-procedure call, implicit local-procedure-table call, or implicit supervisor call.


The fault-handler procedures can be located anywhere in the address space. Each procedure must
begin on a word boundary.


The processor can execute the procedure in the user mode or the supervisor mode, depending on the 
type of fault table entry. 


To resume work on a program at the point where a fault occurred (following the recovery action of the
fault handler), the fault handler must be executed in the supervisor mode. The reason for this
requirement is described in "Program and Instruction Resumption Following a Fault" in this section.


Many of the faults that occur can be recovered from easily. When recovery from the fault is possible,
the processor's fault-handling mechanism allows the processor to automatically resume work on the
program or interrupt that it was working on when the fault occurred. The resumption action is initiated
with a ret instruction in the fault-handler procedure.


If recovery from the fault is not possible or not desirable, the fault handler can take one of the
following actions, depending on the nature and severity of the fault condition (or conditions, in the
case of multiple faults):


Return to a point in the program or interrupt code other than the point of the fault.


Save the current state of the processor and call a debug monitor.


Save the current state of the processor and place the processor in the stopped state (using the freeze
IAC).


Explicitly write the processor state, fault record, and instruction resumption record into memory
and place the processor in the stopped state.


Place the processor in the stopped state without explicitly saving the processor state or the fault
information.


When working with the processor at the development level, a common action of the fault handler is
to save the fault and processor state information and make a call to a debugging device such as a
debugging monitor. This device can then be used to analyze the fault.


The processor allows work on a program to be resumed at the point where the fault occurred following
a return from a fault handler. If an instruction was suspended to handle the fault, execution of the
instruction can also be resumed on the return.


This resumption mechanism is similar to that provided by returning from an interrupt handler. It is only
useful, however, for faults from which recovery is possible, such as the trace faults.


To use this mechanism, the fault handler must be invoked using an implicit supervisor
procedure-table call. This method is required because, to resume work on the program and a suspended
instruction at the point where the fault occurred, the saved process controls in the fault record must
be copied back into the processor on the return from the fault handler. The processor performs
this action only if it is in the supervisor mode on the return.


If the fault handler is invoked with an implicit local-procedure call or an implicit
local-procedure-table call, the return IP determines where in the program the processor resumes work
following a return from a fault handler. Here, the return is handled in a similar manner to a return
from an explicit call with a call or callx instruction.


The return IP (referred to later in this section as the saved IP) is saved in the RIP register (r2) of the
stack frame that was in use when the fault occurred. This IP may be the instruction the processor
faulted on or the next instruction that the processor would have executed if the fault had not occurred.
In either case, the resumption record is not used, so the processor might continue work on the program
without completing the instruction that the fault occurred on.


A fault handler should thus be invoked with an implicit local-procedure or local-procedure-table call
only if it is not required or desirable to resume the program at the point of the fault. The section
"Return Without Resumption" discusses returning to a point in the program code other than the point
of the fault.


Certain fault types and subtypes have masks or flags associated with them that determine whether 
or not a fault is signalled when a fault condition occurs. Table 15 lists these flags and masks, the 
system data structures in which they are located, and the fault subtype they affect. 


Flag or Mask Name                  Location               Fault Affected

Integer Overflow Mask              Arithmetic Controls    Integer Overflow
Floating Overflow Mask             Arithmetic Controls    Floating Overflow
Floating Underflow Mask            Arithmetic Controls    Floating Underflow
Floating Invalid Operation Mask    Arithmetic Controls    Floating Invalid Operation
Floating Zero-Divide Mask          Arithmetic Controls    Floating Zero-Divide
Floating-point Inexact Mask        Arithmetic Controls    Floating Inexact
No Imprecise Faults Flag           Arithmetic Controls    All Imprecise Faults
Trace-Enable Flag                  Process Controls       All Trace Faults
Trace-Mode Flags                   Trace Controls         All Trace Faults


The integer and floating-point mask bits inhibit faults from being raised for specific fault conditions
(i.e., integer overflow and floating-point overflow, underflow, zero divide, invalid operation, and
inexact). The use of these masks is discussed in the fault-reference section at the end of this section.
Also, the floating-point fault masks are described in Chapter 11 in "Exceptions and Fault Handling".


The no-imprecise-faults (NIF) flag controls the synchronizing of faults for a category of faults called 
imprecise faults. This flag should be set to 1. The function of this flag is described later in the section 
"Precise and Imprecise Faults". 


The trace-mode flags (in the trace controls) and trace-enable flag (in the process controls) support 
trace faults. The trace-mode flags enable trace modes; the trace-enable flag enables the generation 
of trace faults. The use of these flags is described in the fault reference section on trace faults. Further 
discussion of these flags is provided in Section 9, "Trace-Enable and Trace-Fault-Pending Flags". 


The processor generates faults implicitly when fault conditions occur and explicitly at the request of 
software. Most faults are generated implicitly. The fault control bits described in the previous section 
allow the implicit generation of some faults to be either enabled (as with the trace faults) or masked 
(as with the floating-point faults). 


The fault-if instructions (faulte, faultne, faultl, faultle, faultg, faultge, faulto, and faultno) allow 
a fault to be generated explicitly anywhere within an application program, kernel procedure, interrupt 
handler, or fault handler. When one of these instructions is executed, the processor checks the 
condition code bits in the arithmetic controls, then signals a constraint-range fault if the condition 
specified with the instruction is met. 
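

The test that a fault-if instruction performs can be sketched in C. This is only an illustration: the 3-bit mask value given to each instruction, the placement of the condition code in the low three bits of the arithmetic controls, and the rule that a mask of 000 matches only a condition code of 000 are all assumptions that do not come from this section. The function and type names are hypothetical; the instruction reference holds the actual encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 3-bit condition masks for the fault-if instructions.
 * These encodings are assumptions made for illustration only. */
enum fault_if_mask {
    FAULTNO = 0x0,
    FAULTG  = 0x1,
    FAULTE  = 0x2,
    FAULTGE = 0x3,
    FAULTL  = 0x4,
    FAULTNE = 0x5,
    FAULTLE = 0x6,
    FAULTO  = 0x7
};

/* Returns true when a constraint-range fault would be signaled, that is,
 * when the condition code meets the condition named by the instruction.
 * Assumption: the condition code is the low 3 bits of the arithmetic
 * controls. */
bool fault_if_signals(enum fault_if_mask mask, uint32_t arith_controls)
{
    uint32_t cc = arith_controls & 0x7u;

    assert((unsigned)mask <= 0x7u);
    if (mask == FAULTNO)
        return cc == 0;              /* assumed: 000 matches only 000 */
    return (cc & (uint32_t)mask) != 0;
}
```

Under these assumed encodings, fault_if_signals(FAULTE, 0x2) is true and fault_if_signals(FAULTE, 0x4) is false.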


When a fault occurs, the processor records information about the fault in a fault record. The fault 
handler and processor use this information to recover from or correct the fault condition and resume 
execution of the process. Figure 21 shows the structure of the fault record. The use of the fields in 
this record is described in the following paragraphs. 


The type number (byte ordinal) of a fault is stored in the fault-type field; the subtype number or bit 
positions (byte ordinal) are stored in the fault-subtype field. 


The fault-flags field provides a set of general-purpose flags that the processor uses to indicate 
additional information about a particular fault subtype. Most of the faults do not use these flags, in 
which case the flags have no defined values. 


The address-of-the-faulting-instruction field contains the IP of the instruction that caused the fault 
or that was being executed when the fault occurred. 


The states of the process controls and arithmetic controls at the time that a fault is generated are stored 
in their respective fields in the fault record. This information is used to resume work on the program 
after the fault has been handled. 


Finally, a three-word fault data field is provided for the fault. The information that is stored in these 
fields depends on the type of fault that occurs. Any part of a fault-data field that is not used for a 
particular fault has no defined value. The information that is stored in these fields for each fault type 
is given in the fault reference section at the end of this section. 


The saved IP (the RIP that is saved in r2 of the stack frame in use when the fault occurred) is also 
part of the fault information that the processor saves when a fault occurs. This IP generally points 
to the next instruction that the processor would have executed if the fault had not occurred, although 
it may point to the faulting instruction. It is this instruction that the processor begins working on when 
the return from the fault handler is initiated. 


If the processor suspends an instruction as the result of a fault, it creates a 48-byte resumption record. 
The criteria that the processor uses to determine whether or not to suspend an instruction and the 
structure of the resumption record are the same as are used when an interrupt occurs. 


The fault and resumption records are stored in the stack that the processor is using when the fault 
occurs. This stack can be the local stack, the supervisor stack, or the interrupt stack. 


Once a fault has occurred, the processor saves the program state, calls the fault handler, and restores 
the program state (if this is possible) once the fault recovery action has been completed. No software 
other than the fault-handler procedures is required to support this activity. 


Three different types of implicit procedure calls can be used to invoke the fault handler according 
to the information in the selected fault-table entry: local call, local call through the system procedure 
table, and supervisor call (also through the system procedure table). 


When the selected fault-handler entry in the fault table is an entry type 00₂ (local procedure), the 
processor performs the following actions: 


1. The processor stores a fault record as shown in Figure 21 on the top of the stack that the processor 
is currently using. The stack can be the local stack, the supervisor stack, or the interrupt stack. 


2. If the fault caused an instruction to be suspended, the processor includes an instruction resumption 
record on the current stack and sets the resume flag in the saved process controls. 


3. The processor creates a new frame on the current stack, with the frame-return status field set to 
001₂. 


4. Using the procedure address from the selected fault-table entry, the processor performs an 
implicit call-extended operation to the fault handler. 


If the fault handler is not able to perform a recovery action, it performs one of the actions described 
under "Possible Fault-Handler Actions". 


If the handler action results in a recovery from the fault, a ret instruction in the fault handler allows 
processor control to return to the program that was being worked on when the fault occurred. On the 
return, the processor performs the following actions: 


1. The processor deallocates the stack frame created for the fault handler. 


2. The processor copies the arithmetic controls field from the fault record into the arithmetic 
controls register in the processor. 


3. The processor then resumes work on the program it was working on when the fault occurred, at 
the instruction in the return IP register. 


When the fault-handler entry selects an entry in the system procedure table (entry type 10₂) and the 
system-procedure-table entry is for a local procedure, the processor performs the same action as is 
described in the previous section for a local procedure call/return. The only difference is that the 
processor gets the address of the fault handler from the system procedure table rather than from the 
fault table. 
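

The choice among the three kinds of implicit call can be sketched as a small classifier. The entry types 00₂ and 10₂ for fault-table entries come from the text. Two things are assumptions made for illustration: that the entry type sits in the two low-order bits of the entry word, and that a system-procedure-table entry marks a supervisor procedure with 10₂ in the same position. All names are hypothetical.

```c
#include <stdint.h>

enum handler_call {
    CALL_LOCAL,           /* entry type 00: address comes from the fault table   */
    CALL_LOCAL_VIA_TABLE, /* entry type 10: local procedure in the system table  */
    CALL_SUPERVISOR,      /* entry type 10: supervisor procedure in system table */
    CALL_INVALID          /* any other combination                               */
};

/* Assumption: the entry type occupies the two low-order bits of the word. */
#define ENTRY_TYPE(word) ((word) & 0x3u)

enum handler_call classify_fault_entry(uint32_t fault_entry_word,
                                       uint32_t proc_table_entry_word)
{
    switch (ENTRY_TYPE(fault_entry_word)) {
    case 0x0:
        return CALL_LOCAL;
    case 0x2:
        /* Assumption: 00 marks a local and 10 a supervisor procedure. */
        switch (ENTRY_TYPE(proc_table_entry_word)) {
        case 0x0: return CALL_LOCAL_VIA_TABLE;
        case 0x2: return CALL_SUPERVISOR;
        default:  return CALL_INVALID;
        }
    default:
        return CALL_INVALID;
    }
}
```

In this sketch the second argument is ignored when the fault-table entry is of type 00₂, because the handler address is then taken directly from the fault table.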


When the fault-handler entry selects an entry in the system procedure table (entry type 10₂) and the 
system-procedure-table entry is for a supervisor procedure, the processor performs the same action 
as is described in the previous section for a local procedure call and return, with the exceptions 
described in the following paragraphs. 


1. If the processor is in user mode when the fault occurs, the fault record and resumption record 
are stored in the local stack. The processor then takes the stack pointer from the procedure table 
and switches to the supervisor stack. The execution mode is then set to supervisor. 


2. If the processor is already in supervisor mode when the fault occurs, the fault record is stored 
in the current stack (which is the supervisor stack). The processor then creates a new frame on 
the current stack and begins work on the fault-handler procedure selected from the procedure 
table. 


3. In both of the above cases, the processor copies the state of the trace-control flag (byte 2, bit 1) 
of the procedure table into the trace-enable flag field of the process controls. 


On the return from the fault handler, the processor performs the following actions: 


1. If the processor is in supervisor mode prior to the return from the fault handler (which it should 
be), it copies the saved process controls into its internal process controls. 


2. If the resume flag of the process controls is set, the processor reads the resumption record from 
the stack. 


3. The processor then resumes work on the program at the point it was working on when the fault 
occurred. 


The restoration of the process controls causes any changes in the process controls through the action 
of the fault handler to be lost. In particular, if the ret instruction from the fault handler caused the 
trace-fault-pending flag in the process controls to be set, this setting would be lost on the return. 


As has been described earlier in this section, faults can occur prior to the execution of the faulting 
instruction (i.e., the instruction that causes the fault), during the instruction, or after the instruction. 
When the fault occurs before the faulting instruction is executed, the instruction can theoretically be 
executed on the return from the fault handler. So, the fault is not accompanied by a change in the 
control flow of the program. 


When a fault occurs during or after the instruction that caused a fault, the fault may be accompanied 
by a change in the program's control flow such that the faulting instruction cannot be reexecuted. 
For example, when an integer-overflow fault occurs, the overflow value is stored in the destination. 
If the destination register was the same as one of the source registers, the source value is lost, making 
it impossible to reexecute the faulting instruction. 
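

The integer-overflow case can be illustrated with a short C model of a 32-bit integer add. The function name and the use of a 64-bit intermediate are illustrative assumptions, not a description of how the processor computes the result. The model stores the 32 least significant bits of the exact sum in the destination and reports whether the exact sum fits in 32 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Models a 32-bit integer add. The 32 least significant bits of the exact
 * sum are always written to *dest_bits. Returns true when the exact sum
 * does not fit in a 32-bit integer, which is the condition under which an
 * integer-overflow fault would be signaled if the fault is not masked. */
bool add_overflows(int32_t a, int32_t b, uint32_t *dest_bits)
{
    int64_t exact;

    assert(dest_bits != NULL);
    exact = (int64_t)a + (int64_t)b;
    *dest_bits = (uint32_t)exact;   /* conversion keeps the low 32 bits */
    return exact > INT32_MAX || exact < INT32_MIN;
}
```

If the destination also held one of the operands, that operand has already been replaced by the truncated sum when the fault is signaled, which is why the add cannot simply be reexecuted.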


In general, changes in the program's control flow never accompany the following fault types or 
subtypes: 


All Operation Subtypes 
Arithmetic Zero-Divide 
All Floating-Point Subtypes Except Floating Inexact 
All Constraint Subtypes 
Prereturn Trace 


Changes in the program's control flow always accompany the following fault types or subtypes: 


All Trace Subtypes Except Prereturn Trace 
Integer Overflow 
Floating Inexact 


Changes in the program's control flow may or may not accompany the following fault types and 
subtypes: 


Structural 
Bad Access 

The effect that specific fault types have on a program is given in the fault reference section at the end 
of this section under the heading "Program State Changes." 


There may be situations where the fault handler needs to return to a point in the program other than 
where the fault occurred. This can be done by altering the return IP in the previous frame. However, 
if resumption information was collected with the fault (resulting in the resume flag being set in the 
saved process controls), such a return can cause unpredictable results. 


To predictably perform a return from a fault handler to an alternate point in the program, the fault 
handler should clear the following information in the process-controls field of the fault record before 
the return: the resume and trace-fault-pending flags, and the internal state field. 


Note 


A return of this type can only be performed if the processor is in supervisor mode prior to the return. 
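

A fault handler's preparation for a return to an alternate point might look like the following C sketch. The bit positions used for the resume flag, the trace-fault-pending flag, and the internal state field are assumptions made for illustration and should be checked against the process-controls description. The structure is a hypothetical stand-in for the two saved values that the handler edits, not the layout of the fault record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions in the process controls word. */
#define PC_RESUME_FLAG          (1u << 9)
#define PC_TRACE_FAULT_PENDING  (1u << 10)
#define PC_INTERNAL_STATE_MASK  (0x7FFu << 21)

/* Stand-in for the saved state that the handler modifies. */
struct saved_state {
    uint32_t process_controls; /* process-controls field of the fault record */
    uint32_t return_ip;        /* RIP (r2) of the frame in use at the fault   */
};

/* Clears the resume flag, the trace-fault-pending flag, and the internal
 * state field, then redirects the return to new_ip. */
void prepare_alternate_return(struct saved_state *s, uint32_t new_ip)
{
    assert(s != NULL);
    s->process_controls &= ~(PC_RESUME_FLAG | PC_TRACE_FAULT_PENDING |
                             PC_INTERNAL_STATE_MASK);
    s->return_ip = new_ip;
}
```

All other bits of the saved process controls are left unchanged, so the rest of the saved state is restored as usual on the return.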


As described in Section 2, "Register Scoreboarding," the 80960KB processor is, in some 
instances, able to execute instructions concurrently. When two instructions are being executed 
concurrently, it is possible for them to generate faults simultaneously. When this occurs, one of the 
faults may not be signaled or may be signaled out of order, making it impossible to recover from that 
fault. 


The processor provides two mechanisms to allow the circumstances under which faults are signaled 
to be controlled. These mechanisms are the no-imprecise-faults flag (NIF flag) in the arithmetic 
controls and the synchronize faults instruction (syncf). The following paragraphs describe how these 
mechanisms can be used. 


Precise faults are those that are intended to be recoverable by software. For any instruction that can 
generate a precise fault, the processor will (1) not execute the instruction if an unfinished prior 
instruction will fault and (2) not execute subsequent out-of-order instructions that will fault. The 
following faults are always precise: 


trace 


protection 


Imprecise faults are those that in some instances are allowed to occur and not be signaled or be 
signaled out of order. These faults include the following: 


operation 


arithmetic 


floating point 


constraint 


Asynchronous faults are those whose occurrence has no direct relationship to the instruction pointer. 
This category includes the machine fault. 


The NIF flag controls whether or not imprecise faults are allowed. When this flag is set, all faults must 
be precise. In this mode, the ability to execute instructions concurrently is essentially disabled. All 
faults that occur are signaled. 


When the NIF flag is clear, faults in the imprecise category can in some instances occur and not be 
signaled. In this mode, the following conditions hold true: 


1. When an imprecise fault occurs, the saved IP is undefined (but the address of the faulting 
instruction in the fault record is valid). 


2. If instructions are executed concurrently when an imprecise fault occurs, the results produced 
by these instructions are undefined. 


3. If instructions are executed out-of-order and multiple imprecise faults occur, only one of the 
faults is generated. The one that is selected is not predictable. 


The syncf instruction forces the processor to complete execution of all instructions that occur prior 
to the syncf instruction and to generate all faults, before it begins work on instructions that occur after 
the syncf instruction. This instruction has two uses. One use is to force faults to be precise when the 
NIF flag is clear. The other use is to ensure that all instructions are complete and all faults signaled in one 
block of code before execution of another block of code (for example, on Ada block boundaries when 
the blocks have different exception handlers). 


The intent of these fault-generating modes is that compiled code should execute with the NIF flag clear, 
using the syncf instruction where necessary to ensure that faults occur in order. In this mode, 
imprecise faults are considered as catastrophic errors from which recovery is not needed. 


If recovery from one or more of the imprecise faults is required (for example, a program that needs 
to handle unmasked floating-point exceptions and recover from them) and the fault handler cannot 
be closely coupled with the application to perform recovery even if the faults are imprecise, the NIF 
flag should be set. Executing with the NIF flag set will likely lead to slower execution times. 


This section describes each of the fault types and subtypes and gives detailed information about what 
is stored in the various fields of the fault record. The section is organized alphabetically by fault type. 


The fault-type section gives the number entered in the fault-type field of the fault record for the given 
fault type. The fault-subtype section lists the fault subtypes and their associated number or bit 
position in the fault-subtype field of the fault record. 


The function section gives a general description of the purpose of the fault type, then describes the 
purpose of each of the fault subtypes in detail. It also describes how the processor handles each fault 
subtype. 


The fault record section describes how the flags, fault-data, and address-of-faulting-instruction fields 
of the fault record are used for the fault type and subtypes. 


The saved IP section describes what value is saved in the RIP register (r2) of the stack frame the 
processor was using when the fault occurred. 


The program state changes section describes the effects that the fault subtypes have on the control 
flow of a program. 


Arithmetic Faults 


Fault Type: 3₁₆ 


Fault Subtype: 

Number    Name 
0         Reserved 
1         Integer Overflow 
2         Arithmetic Zero-Divide 
3-F       Reserved 


Function: 

Indicates that there is a problem with an operand or the result of an arithmetic instruction. This fault 
type applies only to ordinal and integer instructions, not floating-point instructions. 

The integer-overflow fault occurs when the result of an integer instruction overflows the destination 
and the integer-overflow mask in the arithmetic-controls register is cleared. Here, the n least 
significant bits of the result are stored in the destination, where n is the destination size. 

The arithmetic zero-divide fault occurs when the divisor operand of an ordinal or integer divide 
instruction is zero. 


Fault Record: 

Flags: Not used. 
Fault Data: Not used. 
Addr. Fault. Inst.: IP for the instruction on which the processor faulted. 


Saved IP: 

IP for the instruction that would have been executed next, if the fault had not occurred. 


Prog. State Changes: 

A change in the program's control flow accompanies the integer-overflow fault, because the result 
is stored in the destination before the fault is signaled. The faulting instruction thus cannot be 
reexecuted. 

A change in the program's control flow does not accompany the arithmetic zero-divide fault, because 
the fault occurs before the execution of the faulting instruction. 


Constraint Faults 


Fault Type: 5₁₆ 


Fault Subtype: 

Number    Name 
0         Reserved 
1         Constraint Range 
2-F       Reserved 


Function: 

Indicates that the processor is either in or not in the required state for the instruction to be executed. 

The constraint-range fault occurs when a fault-if instruction is executed and the condition code in the 
arithmetic controls matches the condition required by the instruction. 


Fault Record: 

Flags: Not used. 
Fault Data: Not used. 
Addr. Fault. Inst.: IP for the instruction on which the processor faulted. 


Saved IP: 

Not used. 


Prog. State Changes: 

No changes in the program's control flow accompany the constraint-range fault. This fault occurs 
after the fault-if instruction has been executed, but the instruction has no effect on the program state. 


Floating-Point Faults 


Fault Type: 4₁₆ 


Fault Subtype: 

Bit Number      Name 
Bit 0           Floating Overflow 
Bit 1           Floating Underflow 
Bit 2           Floating Invalid-Operation 
Bit 3           Floating Zero-Divide 
Bit 4           Floating Inexact 
Bit 5           Floating Reserved-Encoding 
Bits 6 and 7    Reserved 


Function: 

Indicates that there is a problem with an operand or the result of a floating-point instruction. Each 
floating-point fault is assigned a bit in the fault-subtype field. Multiple floating-point faults can 
occur simultaneously, however, only with the floating-overflow, floating-underflow, and 
floating-inexact faults. 

The floating-point faults are described in detail in the section in Chapter 12 titled "Exceptions and 
Fault Handling." The following paragraphs give a brief description of each floating-point fault. 

A floating-overflow fault occurs when (1) the floating-point overflow mask is clear and (2) the 
infinitely precise result of a floating-point instruction exceeds the largest allowable finite value for the 
specified destination format. This fault interacts with the floating-inexact fault (as described in 
Chapter 12). 

A floating-underflow fault occurs when (1) the floating-point underflow mask is clear and (2) the 
infinitely precise result of a floating-point instruction is less than the smallest possible normalized, 
finite value for the specified destination format. This fault interacts with the floating-inexact fault (as 
described in Chapter 12). 

The floating invalid-operation fault occurs when (1) the floating-point invalid-operation mask is clear 
and (2) one of the source operands for a floating-point instruction is inappropriate for the type of 
operation being performed. 

The floating zero-divide fault occurs when (1) the floating-point zero-divide mask is clear and (2) the 
divisor operand of a floating-point divide instruction is zero. 

The floating-inexact fault occurs when (1) the floating-point inexact mask is clear and (2) an infinitely 
precise result cannot be encoded in the format specified for the destination operand. This fault 
interacts with the floating-overflow and floating-underflow faults (as described in Chapter 12). 

The floating reserved-encoding fault occurs when a denormalized value is used as an operand in a 
floating-point instruction and the normalizing-mode bit in the arithmetic controls is clear. 


Fault Record: 

Flags: 
F0 - Used if inexact fault occurs in conjunction with overflow or underflow fault. If set, F0 indicates 
that the adjusted result has been rounded toward +∞; if clear, F0 indicates that the adjusted result has 
been rounded toward -∞. 
F1 - Used with overflow and underflow faults only. If set, F1 indicates that the adjusted result has 
been bias adjusted, because its exponent was outside the range of the extended-real format. 

Fault Data: Used only with overflow and underflow faults. Adjusted result is stored in this field in 
extended-real format (as shown in Figure 12-5). 

Addr. Fault. Inst.: IP for the instruction on which the processor faulted. 


Saved IP: 

IP for the instruction that would have been executed next, if the fault had not occurred. 


Prog. State Changes: 

Changes in the program's control flow accompany the floating-overflow, floating-underflow, and 
floating-inexact faults, because a result is stored in the destination before the fault is signaled. The 
faulting instruction thus cannot be reexecuted. 

Changes in the program's control flow do not accompany the floating invalid-operation, floating 
zero-divide, and floating reserved-encoding faults, because the faults occur before the execution of the 
faulting instruction. 
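

Because the floating-point fault subtypes are bit positions rather than numbers, a fault handler has to test each bit of the fault-subtype byte. The following C sketch decodes that byte using the bit assignments listed above. The function name, the name strings, and the buffer-based interface are hypothetical, and how the handler obtains the subtype byte from the fault record is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Names for fault-subtype bits 0 through 5 of a floating-point fault. */
static const char *const fp_subtype_names[6] = {
    "floating overflow",
    "floating underflow",
    "floating invalid-operation",
    "floating zero-divide",
    "floating inexact",
    "floating reserved-encoding"
};

/* Writes a comma-separated list of the subtypes whose bits are set into
 * buf and returns how many were found. Bits 6 and 7 are reserved and are
 * ignored. The list is truncated if buf is too small. */
int describe_fp_subtype(uint8_t subtype, char *buf, size_t buflen)
{
    int count = 0;

    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';
    for (int bit = 0; bit < 6; bit++) {
        if (subtype & (1u << bit)) {
            if (count > 0)
                strncat(buf, ", ", buflen - strlen(buf) - 1);
            strncat(buf, fp_subtype_names[bit], buflen - strlen(buf) - 1);
            count++;
        }
    }
    return count;
}
```

A subtype byte with bits 0 and 4 set, for example, describes a floating-overflow fault that occurred together with a floating-inexact fault.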


Machine Faults 


Fault Type: 8₁₆ 


Fault Subtype: 

Number    Name 
0         Reserved 
1         Bad Access 
2-F       Reserved 


Function: 

Indicates that the processor has detected a hardware or memory-system error. 

The bad-access fault is the only one of this fault type. This fault occurs whenever an unrecoverable 
memory error occurs on a memory operation. In the 80960KB processor, the processor receives a 
signal on its bad access pin (BADAC) to indicate an unrecoverable memory error. Upon receiving this 
signal, the processor signals a machine bad access fault. There is one exception to this action. The 
processor will not signal a machine bad access fault while executing any of the synchronous load or 
move instructions. Instead, it sets the condition code bits to indicate whether or not the memory access 
was completed successfully. 


Fault Record: 

Flags: Not used. 
Fault Data: Not used. 
Addr. Fault. Inst.: Not used. 


Saved IP: 

Not used. 


Prog. State Changes: 

This fault may occur at any time. When it does occur, the accompanying state of the program's control 
flow is undefined. As a result, the processor is not able to return predictably from the fault handler 
to the point in the program where the fault occurred. 

If this fault occurs during an atomic operation, there is no guarantee that the locking mechanism that 
memory uses for synchronization is unlocked. 


Operation Faults 


Fault Type: 2₁₆ 


Fault Subtype: 

Number    Name 
0         Reserved 
1         Invalid Opcode 
2         Unimplemented 
3         Reserved 
4         Invalid Operand 
5-F       Reserved 


Function: 

Indicates that the processor cannot execute the current instruction because of invalid instruction 
syntax or operand semantics. 

The invalid-opcode fault occurs when the processor attempts to execute an instruction that contains 
an undefined opcode or addressing mode. 

The unimplemented fault occurs when unaligned memory accesses are not allowed and the processor 
attempts to access an unaligned word or group of words in memory. (The 80960KB processor does 
allow unaligned memory accesses, so this fault never occurs.) 

The invalid-operand fault occurs when the processor attempts to execute an instruction for which one 
or more of the operands have special requirements and one or more of the operands do not meet 
these requirements. This fault subtype is not generated on floating-point instructions. 


Fault Record: 

Flags: Not used. 
Fault Data: Not used. 
Addr. Fault. Inst.: IP for the last instruction executed in the process. 


Saved IP: 

IP for the instruction that would have been executed next, if the fault had not occurred. 


Prog. State Changes: 

A change in the program's control flow does not accompany the operation faults, because the faults 
occur before the execution of the faulting instruction. 


Protection Faults 


Fault Type: 7₁₆ 


Fault Subtype: 

Bit Number    Name 
Bit 0         Reserved 
Bit 1         Length 
Bits 2-7      Reserved 


Function: 

Indicates that the index operand used in a calls instruction points to an entry beyond the extent of the 
system procedure table. 


Fault Record: 

Fault Flags: Not used. 
Fault Data: Not used. 
Addr. Fault. Inst.: IP for the instruction on which the processor faulted. 


Saved IP: 

Same as the address-of-faulting-instruction field. 


Prog. State Changes: 

A change in the program's control flow does not accompany the protection length fault. 


Trace Faults 


Fault Type: 1₁₆ 


Fault Subtype: 

Bit Number    Name 
Bit 0         Reserved 
Bit 1         Instruction Trace 
Bit 2         Branch Trace 
Bit 3         Call Trace 
Bit 4         Return Trace 
Bit 5         Prereturn Trace 
Bit 6         Supervisor Trace 
Bit 7         Breakpoint Trace 


Function: 

Indicates that the processor has detected one or more trace events. The processor's event tracing 
mechanism is described in detail in Chapter 10. 

A trace event is the occurrence of a particular instruction or type of instruction in the instruction 
stream. The processor recognizes seven different trace events (instruction, branch, call, return, 
prereturn, supervisor, and breakpoint). It detects these events, however, only if a mode bit is set for 
the event in the trace controls word, which is cached in the processor chip. If, in addition, the 
trace-enable flag in the process controls is set, the processor generates a fault when a trace event is 
detected. 

The fault is generated following the instruction that causes a trace event (or prior to the instruction 
for the prereturn trace event). 

The following trace modes are available: 

• Instruction - Generate trace event following any instruction. 

• Branch - Generate trace event following any branch instruction when branch is taken. 

• Call - Generate trace event following any call or branch-and-link instruction, or implicit procedure 
call (i.e., call to fault or interrupt handler). 

• Return - Generate trace event following any return instruction. 

• Prereturn - Generate trace event prior to any return instruction. 

• Supervisor - Generate trace event following any call-system instruction. 

• Breakpoint - Generate trace event following any processor action that causes a breakpoint condition. 

There is a trace fault subtype and a bit in the fault-subtype field associated with each of these modes. 
Multiple fault subtypes can occur simultaneously, with the fault-subtype bit set for each subtype 
that occurs. 

When a fault type other than a trace fault occurs during the execution of an instruction that causes a 
trace event, the non-trace fault is handled before the trace fault. An exception to this rule is the 
prereturn trace fault. The prereturn trace fault will occur before the processor has a chance to detect 
a non-trace fault, so it is handled first. 

Likewise, if an interrupt occurs during an instruction that causes a trace event, the interrupt is serviced 
before the trace fault is handled. Again, the prereturn trace fault is an exception. Since it occurs 
before the instruction, it will be handled before any interrupt that might occur during the execution 
of the instruction. 


Fault Record: 

Flags: Not used. 
Fault Data: Not used. 
Addr. Fault. Inst.: IP for the instruction that caused the trace event, except for the prereturn trace 
fault. For the prereturn trace fault, this field has no defined value. 


Saved IP: 

IP for the instruction that would have been executed next, if the fault had not occurred. 


Prog. State Changes: 

A change in the program's control flow accompanies all the trace faults (except the prereturn trace 
fault), because the events that can cause a trace fault occur after the faulting instruction is completed. 
As a result, the faulting instruction cannot be reexecuted upon returning from the fault handler. 

Since the prereturn trace fault occurs before the return instruction is executed, a change in the 
program's control flow does not accompany this fault and the faulting instruction can be executed 
upon returning from the fault handler. 


Type Faults 


Fault Type: A₁₆ 


Fault Subtype: 

Number    Name 
0         Reserved 
1         Type Mismatch 
2-F       Reserved 


Function: 

Indicates that an attempt was made to execute the modpc instruction while the processor was in the 
user mode. 


Fault Record: 

Flags: Not used. 
Fault Data: Not used. 
Addr. Fault. Inst.: IP for the instruction on which the processor faulted. 


Saved IP: 

Not used. 


Prog. State Changes: 

When a type mismatch fault occurs, the accompanying state of the program is undefined. The 
processor is thus not able to return predictably from the fault handler to the point in the program where 
the fault occurred. 


This section describes the tracing facilities of the 80960KB processor, which allow the monitoring 
of instruction execution. 


The 80960KB processor provides facilities for monitoring the activity of the processor by means of 
trace events. A trace event in the 80960KB is a condition where the processor has just completed 
executing a particular instruction or type of instruction, or where the processor is about to execute 
a particular instruction. 


By monitoring trace events, debugging software is able to display or analyze the activity of the 
processor or of a program. This analysis can be used to locate software or hardware bugs or for general 
system monitoring during the development of system or applications programs. 


The typical way to use this tracing capability is to set the processor to detect certain trace events either 
by means of the trace-controls word or a set of breakpoint registers. An alternate method of creating 
a trace event is with the mark and force mark (fmark) instructions. These instructions cause an 
explicit trace event to be generated when the processor detects them in the instruction stream. 


If tracing is enabled, the processor signals a trace fault when it detects a trace event. The fault handler 
for trace faults can then call the debugging monitor software to display or analyze the state of the 
processor when the trace event occurred. 
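

The two-level gating of trace faults (a mode flag per event, plus the trace-enable flag) can be sketched in C. The subtype bit numbers are those given in the trace-fault reference. The assumption that the mode flags can be represented with the same bit numbers, and all of the names, are for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fault-subtype bits for trace faults (bit 0 is reserved). */
#define TRACE_INSTRUCTION (1u << 1)
#define TRACE_BRANCH      (1u << 2)
#define TRACE_CALL        (1u << 3)
#define TRACE_RETURN      (1u << 4)
#define TRACE_PRERETURN   (1u << 5)
#define TRACE_SUPERVISOR  (1u << 6)
#define TRACE_BREAKPOINT  (1u << 7)
#define TRACE_ALL         0xFEu

/* occurred:     events that the current instruction would raise.
 * mode_flags:   enabled trace modes (assumed to use the same bit numbers).
 * trace_enable: state of the trace-enable flag in the process controls.
 * Returns the fault-subtype byte of the trace fault, or 0 when no trace
 * fault is signaled. */
uint8_t trace_fault_subtype(uint8_t occurred, uint8_t mode_flags,
                            bool trace_enable)
{
    uint8_t detected = (uint8_t)(occurred & mode_flags & TRACE_ALL);

    return trace_enable ? detected : 0;
}
```

An event whose mode flag is clear is not detected at all, and a detected event produces no fault unless tracing is enabled. Several subtype bits can be set in the same result.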


To use the processor's tracing facilities, software must provide trace-fault handling procedures, 
perhaps interfaced with a debugging monitor. Software must also manipulate several control flags 
to enable the various tracing modes and to enable or disable tracing in general. These control flags 
are located in the system-data structures described in the next section. 


Trace controls 


Trace-enable flag in the process controls


Trace-fault-pending flag in the process controls


Trace flag (bit 0) in the return-status field of register r0


Trace-control flag in the supervisor-stack-pointer field of the system table or a procedure table


The trace controls allow software to define the conditions under which trace events are generated.
Figure 22 shows the structure of the trace-controls word.


[Figure 22: trace-controls word. Mode flags: instruction, branch, call, return, prereturn, supervisor,
and breakpoint trace mode. Event flags: instruction, branch, call, return, prereturn, supervisor, and
breakpoint trace event. All other bits are reserved (initialize to 0).]


This word contains two sets of bits: the mode flags and the event flags. The mode flags define a set
of trace modes that the processor can use to generate trace events. A mode represents a subset
of instructions that will cause trace events to be generated. For example, when the call-trace mode
is enabled, the processor generates a trace event whenever a call or branch-and-link operation is
executed. To enable a trace mode, the kernel sets the mode flag for the selected trace mode in the trace
controls. The trace modes are described later in this section.


The processor uses the event flags to keep track of which trace events (for those trace modes that have 
been enabled) have been detected. 


A special instruction, the modify-trace-controls (modtc) instruction, allows software to set or clear
flags in the trace controls. On initialization, all the flags in the processor's internal trace controls
are cleared. The modtc instruction can then be used to set or clear trace mode flags as required.


Software can access the event flags using the modtc instruction, but there is normally no reason to do
so. The processor modifies these flags as part of its trace-handling mechanism.


Bits 0, 8 through 16, and 24 through 31 of the trace controls are reserved. Software should initialize
these bits to zero and not modify them.
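The layout of the trace-controls word can be sketched in Python as a small model. This is only an illustration under stated assumptions: the reserved ranges (bits 0, 8 through 16, and 24 through 31) come from the text, but the exact bit assigned to each named flag is assumed here from the order shown in Figure 22, and the masked-update helper `modify_trace_controls` is a hypothetical stand-in for what modtc does, not its documented definition.

```python
# Assumed bit assignments: only the reserved ranges come from the text;
# the per-flag order is taken from the order of labels in Figure 22.
MODES = ["instruction", "branch", "call", "return",
         "prereturn", "supervisor", "breakpoint"]
MODE_BIT = {name: 1 + i for i, name in enumerate(MODES)}    # bits 1..7
EVENT_BIT = {name: 17 + i for i, name in enumerate(MODES)}  # bits 17..23

RESERVED_MASK = 0x1 | (0x1FF << 8) | (0xFF << 24)  # bits 0, 8-16, 24-31


def modify_trace_controls(current, mask, value):
    """Hypothetical masked update: only bits selected by mask change.

    Returns (new_value, old_value). Reserved bits are never altered.
    """
    mask &= ~RESERVED_MASK & 0xFFFFFFFF
    new = (current & ~mask) | (value & mask)
    return new & 0xFFFFFFFF, current


# Enable call-trace and return-trace modes starting from the reset state (0).
wanted = (1 << MODE_BIT["call"]) | (1 << MODE_BIT["return"])
trace_controls, previous = modify_trace_controls(0, wanted, wanted)
```

Filtering the mask against `RESERVED_MASK` keeps the reserved bits at zero even if a caller passes an all-ones mask.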


The trace-enable flag and the trace-fault-pending flag, in the process controls (shown in Figure 14),
control tracing. The trace-enable flag enables the processor's tracing facilities. When this flag is set,
the processor generates trace faults on all trace events.


Typically, software selects the trace modes to be used through the trace controls. It then sets the
trace-enable flag when tracing is to begin. This flag is also altered as part of some of the call and
return operations that the processor carries out, as described at the end of this section.


The trace-fault-pending flag allows the processor to keep track of the fact that an enabled trace event
has been detected. The processor uses this flag as follows. When the processor detects an enabled
trace event, it sets this flag. Before executing an instruction, the processor checks this flag. If the flag
is set, it signals a trace fault. By providing a means of recording the occurrence of a trace event, the
trace-fault-pending flag allows the processor to service an interrupt or handle a fault other than a trace
fault before handling the trace fault. Software should not modify this flag.


The trace flag and the trace-control flag allow tracing to be enabled or disabled when a call-system
instruction (calls) is executed that results in a switch to supervisor mode. This action occurs
independent of whether or not tracing is enabled prior to the call.


When a supervisor call is executed (a calls instruction that references an entry in the system procedure
table with an entry type of 11 binary), the processor saves the current state of the trace-enable flag
(from the process controls) in the trace flag (bit 0) of the return-status field of register r0.


Then, when the processor selects the supervisor procedure from the procedure table, it sets the
trace-enable flag in the process controls according to the setting in the trace-control flag in the
procedure table (bit 0 of the word that contains the supervisor-stack pointer).


On a return from the supervisor procedure, the trace-enable flag in the process controls is restored
to the value saved in the return-status field of register r0.
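The save-and-restore behavior around a supervisor call can be modeled with a short Python sketch. The class and function names here (`State`, `supervisor_call`, `supervisor_return`) are hypothetical and exist only for illustration; the model tracks just the two flags discussed above and ignores everything else the processor does on a call.

```python
from dataclasses import dataclass


@dataclass
class State:
    trace_enable: bool            # trace-enable flag in the process controls
    r0_trace_flag: bool = False   # trace flag (bit 0) of the return-status field of r0


def supervisor_call(state, table_trace_control):
    """Save trace-enable in r0, then load it from the procedure table's flag."""
    state.r0_trace_flag = state.trace_enable
    state.trace_enable = table_trace_control


def supervisor_return(state):
    """Restore trace-enable from the value saved in r0."""
    state.trace_enable = state.r0_trace_flag


s = State(trace_enable=True)
supervisor_call(s, table_trace_control=False)   # tracing is off inside the kernel
inside = s.trace_enable
supervisor_return(s)                            # the caller's setting comes back
```

The point of the sketch is that the caller's tracing state survives the call regardless of what the procedure table selects for the supervisor procedure.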


Instruction trace 


Branch trace 


Call trace 


Return trace 


Prereturn trace


Supervisor trace


Breakpoint trace


These modes can be enabled individually or several modes can be enabled at once. Some of these
modes overlap, such as the call-trace mode and the supervisor-trace mode. The section "Handling
Multiple Trace Events" describes what the processor does when multiple trace events occur.


When the instruction-trace mode is enabled, the processor generates an instruction-trace event each
time an instruction is executed. This mode can be used within a debugging monitor to single-step the
processor.


When the branch-trace mode is enabled, the processor generates a branch-trace event any time a
branch instruction that branches is executed. A branch-trace event is not generated for
conditional-branch instructions that do not branch. Also, branch-and-link, call, and return instructions
do not cause branch-trace events to be generated.


When the call-trace mode is enabled, the processor generates a call-trace event any time a call
instruction (call, callx, or calls) or a branch-and-link instruction (bal or balx) is executed. An implicit
call, such as the action used to invoke a fault handler or an interrupt handler, also causes a call-trace
event to be generated.


When the processor detects a call-trace event, it also sets the prereturn-trace flag (bit 3 of register r0)
in the new frame created by the call operation or in the current frame if a branch-and-link operation
was performed. The processor uses this flag to determine whether or not to signal a prereturn-trace
event on a return instruction.


When the return-trace mode is enabled, the processor generates a return-trace event any time a ret
instruction is executed.


The prereturn-trace mode causes the processor to generate a prereturn-trace event prior to the
execution of any ret instruction, providing the prereturn-trace flag in r0 is set. (Prereturn tracing
cannot be used without enabling call tracing.)


The processor sets the prereturn-trace flag whenever it detects a call-trace event (as described above
for the call-trace mode). This flag performs a prereturn-trace-pending function. If another trace event
occurs at the same time as the prereturn-trace event, the prereturn-trace flag allows the processor to
fault on the non-prereturn-trace event first, then come back and fault again on the prereturn-trace
event. The prereturn trace is the only trace event that can cause two successive trace faults to be
generated between instruction boundaries.


When the supervisor-trace mode is enabled, the processor generates a supervisor-trace event any time
(1) a call-system instruction (calls) is executed, where the procedure table entry is a supervisor
procedure, or (2) a ret instruction is executed and the return-status field is set to 010 or 011 binary
(i.e., return from supervisor mode).


This trace mode allows a debugging program to determine the boundaries of kernel procedure calls 
within the instruction stream. 


The breakpoint-trace mode allows trace events to be generated at places other than those specified
with the other trace modes. This mode is used in conjunction with the mark and force-mark (fmark)
instructions, and the breakpoint registers.


The mark and fmark instructions allow breakpoint-trace events to be generated at specific points
in the instruction stream. When the breakpoint-trace mode is enabled, the processor generates a
breakpoint-trace event any time it encounters a mark instruction. The fmark instruction causes the
processor to generate a breakpoint-trace event regardless of whether the breakpoint-trace mode is
enabled or not.


The processor has two one-word breakpoint registers, designated as breakpoint 0 and breakpoint 1.
Using the set-breakpoint-register IAC, one instruction pointer can be loaded into each register. The
processor then generates a breakpoint trace any time it executes an instruction referenced in a
breakpoint register.
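The three sources of breakpoint-trace events can be summarized in a small Python sketch. The function name and its parameters are hypothetical; the rule that a breakpoint-register match also requires the breakpoint-trace mode to be enabled is taken from the summary of trace-event conditions elsewhere in this section.

```python
def breakpoint_event(instruction, ip, mode_enabled, breakpoint_regs):
    """Return True if executing `instruction` at address `ip` generates
    a breakpoint-trace event.

    instruction:     mnemonic of the instruction, e.g. "mark", "fmark", "addo"
    mode_enabled:    state of the breakpoint-trace mode flag
    breakpoint_regs: the (up to two) instruction pointers loaded into
                     breakpoint registers 0 and 1
    """
    if instruction == "fmark":
        return True                      # fmark ignores the mode flag
    if instruction == "mark":
        return mode_enabled              # mark needs the mode enabled
    return mode_enabled and ip in breakpoint_regs


regs = (0x1000, 0x2040)
hit = breakpoint_event("addo", 0x2040, True, regs)
forced = breakpoint_event("fmark", 0x3000, False, regs)
```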


A fault handler is a procedure that the processor calls to handle faults that occur. The requirements
for fault handlers are given in Section 8, "Fault-Handler Procedures."


A trace-fault handler has one additional restriction. It must be called with an implicit supervisor call,
and the trace-control flag in the system-procedure-table entry must be clear. This restriction ensures
that tracing is turned off when a trace fault is being handled, which is necessary to prevent an endless
loop.


To summarize the information presented in the previous sections, the processor signals a trace event
when it detects any of the following conditions:


An instruction included in a trace-mode group is executed or about to be executed (in the case
of a prereturn trace event) and the trace mode for that instruction is enabled.


An implicit call operation has been executed and the call-trace mode is enabled.


A mark instruction has been executed and the breakpoint-trace mode is enabled.


An fmark instruction has been executed.


An instruction specified in a breakpoint register is executed and the breakpoint-trace mode is
enabled.


When the processor detects a trace event and the trace-enable flag in the process controls is set, the 
processor performs the following action: 


1. The processor sets the appropriate trace-event flag in the trace controls. If a trace event meets
the conditions of more than one of the enabled trace modes, a trace-event flag is set for each trace
mode condition that is met.


2. The processor sets the trace-fault-pending flag in the process controls.


Note 


The processor may set a trace-event flag and the trace-fault-pending flag before it has completed
execution of the instruction that caused the event. However, the processor only handles trace events
in between the execution of instructions.


If, when the processor detects a trace event, the trace-enable flag in the process controls is clear, the
processor sets the appropriate event flags, but does not set the trace-fault-pending flag.
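The recording of a detected trace event can be sketched as follows. This is a simplified Python model with hypothetical names: event flags are represented as a set of mode names rather than as bits in the trace controls, and the timing subtleties mentioned in the note above are ignored.

```python
def on_trace_event(met_modes, trace_enable, event_flags=frozenset()):
    """Record a detected trace event.

    met_modes:    names of the trace-mode conditions the event meets
    trace_enable: trace-enable flag in the process controls
    Returns (event_flags, trace_fault_pending_was_set).
    """
    new_flags = set(event_flags) | set(met_modes)    # one flag per condition met
    set_pending = bool(trace_enable and met_modes)   # pending only if tracing is on
    return new_flags, set_pending


# A calls instruction to a supervisor procedure meets two conditions at once.
flags, pending = on_trace_event({"call", "supervisor"}, trace_enable=True)

# With tracing disabled the event flag is still set, but no fault becomes pending.
flags_off, pending_off = on_trace_event({"branch"}, trace_enable=False)
```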


If the processor detects multiple trace events, it records one or more of them based on the following
precedence, where 1 is the highest precedence:


1. Supervisor-trace event


2. Breakpoint- (from mark or fmark instruction, or from a breakpoint register), branch-, call-, or
return-trace event


When multiple trace events are detected, the processor may not signal each event; however, it will
signal at least the one with the highest precedence.


Once a trace event has been signaled, the processor determines how to handle the trace event
according to the setting of the trace-enable and trace-fault-pending flags in the process controls and
to other events that might occur simultaneously with the trace event, such as an interrupt or a
non-trace fault.


1. The processor checks the state of the trace-fault-pending flag. If this flag is clear, the processor
begins execution of the next instruction. If the flag is set, the processor performs the following
actions.


2. The processor checks the state of the trace-enable flag. If the trace-enable flag is clear, the
processor clears any trace event flags that have been set, prior to starting execution of the next
instruction. If the trace-enable flag is set, the processor performs the following action.


3. The processor signals a trace fault and begins the fault handling action, as described in Section
8.
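The three steps above amount to a short decision procedure, sketched here in Python. The function name and the string results are hypothetical; the sketch only captures the order of the checks and the clearing of event flags in step 2.

```python
def before_next_instruction(trace_fault_pending, trace_enable, event_flags):
    """Return (action, event_flags) following steps 1 through 3."""
    if not trace_fault_pending:             # step 1: nothing pending
        return "execute", set(event_flags)
    if not trace_enable:                    # step 2: tracing off, drop the events
        return "execute", set()
    return "trace_fault", set(event_flags)  # step 3: signal the trace fault


action, remaining = before_next_instruction(True, True, {"call"})
```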


The processor handles a prereturn-trace event the same as described above except when it occurs at
the same time as a non-trace fault. Here, the non-trace fault is handled first.


On returning from the fault handler for the non-trace fault, the processor checks the prereturn-trace
flag in register r0. If this flag is set, the processor generates a prereturn-trace event, then handles it
as described above.


When the processor invokes an interrupt handler to service an interrupt, it disables tracing. It does
this by saving the current state of the process controls, then clearing the trace-enable and
trace-fault-pending flags in the current process controls.


On returning from the interrupt handler, the processor restores the process controls to the state they
were in prior to handling the interrupt, which restores the state of the trace-enable and
trace-fault-pending flags. If these two flags were set prior to calling the interrupt handler, a trace
fault will be signaled on the return from the interrupt handler.


The processor can invoke a fault handler with either an implicit local call or an implicit supervisor
call. On a local call, the trace-enable and trace-fault-pending flags are neither saved on the call nor
restored on the return. The state of these flags on the return is thus dependent on the action of the fault
handler.


On a supervisor call, the trace-enable and trace-fault-pending flags are saved, as part of the saved
process controls, and restored on the return. So, if these two flags were set prior to calling the fault
handler, a trace fault will be signaled on the return from the fault handler.


Note 


On a return from an interrupt handler or a fault handler (other than the trace-fault handler), the
trace-fault-pending flag is restored. If this flag is set as a result of the handler's ret instruction, the
detected trace event is lost.


This section provides detailed information about each of the instructions for the 80960KB processor.
To provide quick access to information on a particular instruction, the instructions are listed
alphabetically by assembly-language mnemonic. An explanation of the format and abbreviations
used in this section is given later.


The information in this section is oriented toward programmers who are writing assembly-language
code for the 80960KB processor. The information provided for each instruction includes the
following:


Alphabetic reference 


Assembly-language mnemonic and name


Assembly-language format


Description of the instruction's operation


Action the instruction carries out when executed (generally presented in the form of an
algorithm)


Faults that can occur during execution


Assembly-language example


Opcode and instruction format 


Related instructions 


Additional information about the instruction set can be found in the following sections and
appendices in this chapter:


Section 5 - Summary of the instruction set by group and description of the assembly-language
instruction format


Appendix A - Instruction Quick Reference


Appendix B - Machine-Level Instruction Formats


To simplify the presentation of information about the instructions, a simple notation has been adopted 
in this section. The following paragraphs describe this notation. 


The instructions are listed alphabetically by assembly-language mnemonic. If several instructions
are related and fall together alphabetically, they are described as a group on a single page.


The reference at the top of each page gives the assembly-language mnemonics for the instructions
covered on that page (e.g., subc). Occasionally, there are so many instructions covered on the page
that it is not practical to give all the mnemonics in the page reference. In these cases, the name of the
instruction group is given in capital letters (e.g., BRANCH or FAULT IF). A box around the
alphabetic reference indicates that the instruction or group of instructions is an extension to the 80960
architecture instruction set.


The Mnemonic section gives the complete mnemonic (in bold-face type) and instruction name for
each instruction covered on the page, for example:


The Format section gives the assembly-language format of the instruction and the type of operands
allowed. The format is given in two or three lines. The following is an example of a two-line format:


        src1,     src2,     dst
        reg/lit   reg/lit   reg


The first line gives the assembly-language mnemonic (bold-face type) and the operands (italics).


When the format is used for two or more instructions, an abbreviated form of the mnemonic is used.
The "*" sign at the end of the mnemonic indicates that the mnemonic has been abbreviated.


The second line of the format shows what is allowed to be entered for each operand. The notation
used on this line is as follows:


reg     Global (g0 ... g15) or local (r0 ... r15) register


freg    Global (g0 ... g15) or local (r0 ... r15) register, or floating-point (fp0 ... fp3)
        register, where the registers contain floating-point numbers


lit     Integer or ordinal literal of the range 0 ... 31


flit    Floating-point literal of value 1.0 or 0.0


disp    Signed displacement of range -2^22 ... (2^22 - 1)


mem     Address defined with the full range of addressing modes


In some cases, a third line will be added to show specifically what will be in a register or memory
location. For example, it may be useful to know that a register is to contain an address. The notation
used in this line is as follows:


addr    Address


efa     Effective address


The Description section describes what the instruction does and the functions of the operands. It also
gives programming hints when appropriate.


The Action section gives an algorithm written in a pseudo-code that describes in detail what actions
the processor takes when executing the instruction and the precise order of these actions. The main
purpose of this section is to show the possible side effects of the instruction. The following is an
example of the action algorithm for the alterbit instruction:


if (AC.cc and 2#010#) = 0
   then dst <- src and not (2^(bitpos mod 32));
   else dst <- src or 2^(bitpos mod 32);
end if;


In these action statements, the term AC.cc means the condition-code bits in the arithmetic controls.
The notation 2#value# means that the value enclosed in the "#" signs is in base 2.


The Faults section lists the faults that can be signaled as the result of execution of the instruction.
Faults listed with all-capital letters refer to a group of faults; faults listed with initial-capital letters
refer to a specific fault.


All instructions can signal a group of general faults which are referred to as STANDARD FAULTS.
The standard faults include the trace-instruction and machine-bad-access faults. In addition, for all
instructions that have a MEM machine format (such as load, store, call extended), the invalid-opcode
and operation-unimplemented faults are standard faults.


Instruction Trace 


Branch Trace 


Call Trace 


Return Trace 


Prereturn Trace 


Supervisor Trace 


Breakpoint Trace 


OPERATION 


Invalid Opcode 


Unimplemented 


Invalid Operand 


Integer Overflow 


Arithmetic Zero-Divide 


Floating Overflow 


Floating Underflow 


Floating Invalid-Operation 


Floating Zero-Divide 


Floating Inexact 


Floating Reserved-Encoding 


Constraint Range 


Privileged 


PROTECTION 


Segment Length 


TYPE 


Type Mismatch 




The Example section gives an assembly-language example of an application of the instruction.


The Opcode and Instruction Format section gives the opcode and machine-language instruction
format for each instruction, for example:


The machine-language format is one of four possible formats: REG, COBR, CTRL, and MEM. Refer
to Appendix B for more information on the machine-language instruction formats.


The See Also section gives the mnemonics of related instructions, which can then be looked up
alphabetically in this section for comparison. For instructions that are grouped on one page (such as
addr and addrl), only the first mnemonic is given.


This section contains reference information on the processor's instructions. It is arranged
alphabetically by instruction or instruction group.


Format:
addc    src1,     src2,     dst
        reg/lit   reg/lit   reg


Description:
Adds the src2 and src1 values, and bit 1 of the condition code (used here as a carry in), and stores
the result in dst. If the ordinal addition results in a carry, bit 1 of the condition code is set;
otherwise, bit 1 is cleared. If integer addition results in an overflow, bit 0 of the condition code is
set; otherwise, bit 0 is cleared. Regardless of the results of the addition, bits 0 and 1 of the
condition code in the arithmetic controls are always written.


The addc instruction can be used for either ordinal or integer arithmetic. The instruction does not
distinguish between ordinal and integer source operands. Instead, the processor evaluates the result
for both data types and sets bits 0 and 1 of the condition code accordingly.


# Let the value of the condition code be xCx.
dst <- src2 + src1 + C;
AC.cc <- 2#0CV#;


# C is carry from ordinal addition.
# V is 1 if integer addition would have generated an overflow.
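The action statement above can be modeled in Python to show how the carry (C) and overflow (V) bits are derived independently from the same 32-bit addition. This is an illustrative sketch, not processor-accurate code: the helper names (`to_signed`, `addc`, `add64`) are hypothetical, operands are assumed to be 32-bit bit patterns held in Python integers, and `add64` mirrors the two-instruction idiom of adding the low words first and letting the carry flow into the high words.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def addc(src1, src2, cc):
    """Model of addc: returns (dst, new_cc) with new_cc = 2#0CV#."""
    carry_in = (cc >> 1) & 1                  # bit 1 of the condition code
    total = src2 + src1 + carry_in
    dst = total & MASK32
    c = total >> 32                           # carry out of the ordinal addition
    signed = to_signed(src2) + to_signed(src1) + carry_in
    v = 0 if -(1 << 31) <= signed < (1 << 31) else 1
    return dst, (c << 1) | v


def add64(a, b):
    """Add two 64-bit values using two addc steps."""
    lo, cc = addc(a & MASK32, b & MASK32, 0b000)   # carry bit cleared first
    hi, cc = addc(a >> 32, b >> 32, cc)
    return (hi << 32) | lo


total = add64(0x00000001FFFFFFFF, 1)
```

Computing C from the unsigned sum and V from the signed sum reflects the statement that the processor evaluates the result for both data types.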


        # Example of double-precision arithmetic
        # Assume 64-bit source operands
        # in g0,g1 and g2,g3
        cmpo  1, 0           # clears Bit 1 (carry bit) of
                             # the AC.cc
        addc  g0, g2, g0     # add low-order 32 bits;
                             # g0 <- g2 + g0 + Carry Bit
        addc  g1, g3, g1     # add high-order 32 bits;
                             # g1 <- g3 + g1 + Carry Bit
                             # 64-bit result is in g0, g1


addi, addo


Mnemonics:
addi    Add Integer
addo    Add Ordinal


Format:
add*    src1,     src2,     dst
        reg/lit   reg/lit   reg


Faults:
STANDARD            Refer to discussion of faults at the beginning of this chapter.

Integer Overflow    Result is too large for destination format. This fault is signaled only
                    when executing the addi instruction and if both of the following
                    conditions are met: (1) the integer-overflow mask in the arithmetic
                    controls is clear and (2) the source operands have like signs and the
                    sign of the result operand is different from the signs of the source
                    operands.


Opcode:
addi    591    REG
addo    590    REG


addr, addrl


Mnemonics:
addr     Add Real
addrl    Add Long Real


Format:
addr*   src1,       src2,       dst
        freg/flit   freg/flit   freg


For the addrl instruction, if the src1, src2, or dst operand references a global or local register, this
register is the first (lowest numbered) of two successive registers. Also, this register must be even
numbered (e.g., g0, g2, g4).


The following table shows the results obtained when adding various classes of numbers, assuming
that neither overflow nor underflow occurs.


                                      src1
src2      -∞      -F          -0      +0      +F          +∞      NaN

-∞        -∞      -∞          -∞      -∞      -∞          *       NaN
-F        -∞      -F          src2    src2    ±F or ±0    +∞      NaN
-0        -∞      src1        -0      ±0      src1        +∞      NaN
+0        -∞      src1        ±0      +0      src1        +∞      NaN
+F        -∞      ±F or ±0    src2    src2    +F          +∞      NaN
+∞        *       +∞          +∞      +∞      +∞          +∞      NaN
NaN       NaN     NaN         NaN     NaN     NaN         NaN     NaN


Notes:


F    Means finite-real number
*    Indicates floating invalid-operation exception


When the sum of two operands with opposite signs is zero, the result is +0, except for the round
toward -∞ mode, in which case the result is -0. When zero is added to itself (e.g., src1 + src1, where
src1 is 0), the result retains the sign of the source.
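Several of the special cases above can be observed with ordinary Python floats, which follow IEEE 754 double precision with round-to-nearest on common platforms. This sketch is an analogy only: it uses host arithmetic rather than the 80960KB, it cannot select the round toward -∞ mode, and where the processor would raise a floating invalid-operation exception for infinities of unlike sign, Python simply returns NaN.

```python
import math

inf = math.inf

# Infinities of unlike sign: the invalid-operation case (NaN in Python).
unlike = inf + (-inf)

# Opposite-signed operands that cancel give +0 in round-to-nearest.
cancel = 1.5 + (-1.5)

# Zero added to itself keeps the sign of the source.
neg_zero = -0.0 + (-0.0)

# Adding a zero to a finite value returns the finite operand.
finite = -2.25 + 0.0
```

`math.copysign(1.0, x)` is used in checks like these because `-0.0 == 0.0` is true, so equality alone cannot distinguish the two zeros.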




Faults:
STANDARD                      Refer to the discussion of faults at the beginning of this chapter.

Floating Reserved Encoding    One or more operands is an unnormalized (including denormalized)
                              value and the normalizing-mode bit in the arithmetic controls is set.


The following floating-point exceptions can be raised. Whether or not an exception results in a fault
being raised depends on the state of its associated mask bit in the arithmetic controls.

Floating Overflow             Result is too large for destination format.

Floating Underflow            Normalized result is too small for destination format.

Floating Invalid Operation    Source operands are infinities of unlike sign.
                              One or more operands is an SNaN value.

Floating Inexact              Result cannot be represented exactly in destination format.
                              Floating overflow occurred and the overflow exception was masked.


Opcode:
addr     78F    REG
addrl    79F    REG


Format:
alterbit    bitpos,   src,      dst
            reg/lit   reg/lit   reg


Description:
Copies the src value to dst with one bit altered. The bitpos operand specifies the bit to be changed;
the condition code determines the value the bit is to be changed to. If the condition code is X1X
(binary), the selected bit is set; otherwise, it is cleared.


Action:
if (AC.cc and 2#010#) = 0
   then dst <- src and not (2^(bitpos mod 32));
   else dst <- src or 2^(bitpos mod 32);
end if;
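The action statement translates almost line for line into Python. The function below is a hypothetical model for illustration; it takes the condition code as an explicit argument and returns the value that would be written to dst, masked to 32 bits.

```python
def alterbit(bitpos, src, cc):
    """Model of alterbit: copy src with one bit set or cleared.

    bitpos: bit number (taken modulo 32)
    src:    32-bit source value
    cc:     3-bit condition code from the arithmetic controls
    """
    bit = 1 << (bitpos % 32)
    if (cc & 0b010) == 0:
        return src & ~bit & 0xFFFFFFFF    # bit 1 of cc clear: clear the bit
    return (src | bit) & 0xFFFFFFFF       # bit 1 of cc set: set the bit


set_result = alterbit(24, 0x00000000, 0b010)
clear_result = alterbit(24, 0xFFFFFFFF, 0b000)
```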


Example:
        # assume condition code is 2#010#
        alterbit  24, g4, g9     # g9 <- g4, with bit 24 set


and, andnot


Mnemonics:
and       And
andnot    And Not


Format:
and       src1,     src2,     dst
          reg/lit   reg/lit   reg

andnot    src1,     src2,     dst
          reg/lit   reg/lit   reg


Description:
Performs a bitwise AND (and instruction) or AND NOT (andnot instruction) operation on the src2
and src1 values and stores the result in dst.


Note that in the action expressions below, the src2 operand comes first, so that with the andnot
instruction the expression is evaluated as


{src2 and not (src1)}


        and     0x17, g8, g2     # g2 <- g8 AND 0x17

        andnot  r3, r12, r9      # r9 <- r12 AND NOT r3


Opcode:
and       581    REG
andnot    582    REG
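The operand order matters for andnot, since it is src1 that is complemented. A minimal Python sketch, with hypothetical function names and 32-bit masking, makes the asymmetry explicit:

```python
def and_(src1, src2):
    """Model of and: dst <- src2 and src1."""
    return src2 & src1 & 0xFFFFFFFF


def andnot(src1, src2):
    """Model of andnot: dst <- src2 and not(src1)."""
    return src2 & ~src1 & 0xFFFFFFFF


masked = and_(0x17, 0xFFFF00FF)        # keep only the bits selected by 0x17
cleared = andnot(0x0F, 0x000000FF)     # clear the low four bits of src2
```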




Format:
atadd    src/dst,   src,      dst
         reg        reg/lit   reg
         addr


Description:
Adds the src value (full word) to the value in the memory location specified with the src/dst
operand. The initial value from memory is stored in dst.


The read and write of memory are done atomically (i.e., other processors are prevented from
accessing the word of memory specified with the src/dst operand until the operation has been
completed).


The memory location in src/dst is the address of the first byte (least significant byte) of the word.
The address is automatically aligned to a word boundary.


tempa <- src/dst and not (3);          # force alignment to word boundary
temp <- atomic_read (tempa);
atomic_write (tempa) <- temp + src;
dst <- temp;


        atadd  r8, r2, r11       # r8 <- r2 + address r8,
                                 # where r8 specifies the
                                 # address of a word in
                                 # memory; r11 <- initial
                                 # value stored at address
                                 # r8 in memory
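The read-modify-write sequence of atadd can be sketched in Python. This is a software analogy under stated assumptions: memory is modeled as a dictionary keyed by word-aligned byte address, and a `threading.Lock` stands in for the bus locking that keeps other processors away from the word. The names `memory`, `_bus_lock`, and `atadd` are hypothetical.

```python
import threading

_bus_lock = threading.Lock()   # stands in for the locked memory access


def atadd(memory, addr, src):
    """Model of atadd: add src to the word at addr, return the old word."""
    aligned = addr & ~3                    # force alignment to word boundary
    with _bus_lock:                        # read and write happen as one unit
        temp = memory[aligned]
        memory[aligned] = (temp + src) & 0xFFFFFFFF
    return temp                            # value that would be placed in dst


memory = {0x100: 5}
old = atadd(memory, 0x102, 3)              # 0x102 is aligned down to 0x100
```

Returning the initial value lets a caller build counters or ticket locks, since each caller sees a distinct prior value.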




atanr, atanrl


Mnemonics:
atanr     Arctangent Real
atanrl    Arctangent Long Real


Format:
atanr*  src1,       src2,       dst
        freg/flit   freg/flit   freg


Description:
Calculates the arctangent of the quotient of src2/src1 and stores the result in dst. The result is
returned in radians and is in the range of -π to +π, inclusive. The sign of the result is always the
sign of src2.


For the atanrl instruction, if the src1, src2, or dst operand references a global or local register, this
register is the first (lowest numbered) of two successive registers. Also, this register must be even
numbered (e.g., g0, g2, g4).


These instructions are commonly used as part of an algorithm to convert rectangular coordinates to
polar coordinates. They can also be used to implement the FORTRAN intrinsic functions ATAN and
ATAN2. If src1 is the floating-point literal value +1.0, then these instructions return a result in the
range of -π/2 to +π/2.


The following table gives the range of results for various values of src2 and src1, assuming that
neither overflow nor underflow occurs.


                                          src1
src2      -∞        -F             -0        +0        +F             +∞        NaN

-∞        -3π/4     -π/2           -π/2      -π/2      -π/2           -π/4      NaN
-F        -π        -π to -π/2     -π/2      -π/2      -π/2 to -0     -0        NaN
-0        -π        -π             -π        -0        -0             -0        NaN
+0        +π        +π             +π        +0        +0             +0        NaN
+F        +π        +π to +π/2     +π/2      +π/2      +π/2 to +0     +0        NaN
+∞        +3π/4     +π/2           +π/2      +π/2      +π/2           +π/4      NaN
NaN       NaN       NaN            NaN       NaN       NaN            NaN       NaN


Notes:


F    Means finite-real number.



Faults:
STANDARD                      Refer to the discussion of faults at the beginning of this chapter.

Floating Reserved Encoding    One or more operands is an unnormalized (including denormalized)
                              value and the normalizing-mode bit in the arithmetic controls is set.


The following floating-point exceptions can be raised. Whether or not an exception results in a fault
being raised depends on the state of its associated mask bit in the arithmetic controls.

Floating Underflow            Result is too small for destination format.

Floating Invalid Operation    One or more operands is an SNaN value.

Floating Inexact              Result cannot be represented exactly in destination format.


        atanrl  g8, g10, fp3     # fp3 <-
                                 # arctan (g10,g11/g8,g9)
        atanrl  1.0, g0, g0      # g0,g1 <- arctan (g0,g1)


Opcode:
atanr     680    REG
atanrl    690    REG


Format:
atmod    src,    mask,     src/dst
         reg     reg/lit   reg
         addr


Description:
Copies the src/dst value into the memory location specified in src. The bits set in the mask operand
select the bits to be modified in memory. The initial value from memory is stored in src/dst.


The read and write of memory are done atomically (i.e., other processors are prevented from
accessing the word of memory specified with the src operand until the operation has been
completed).


The memory location in src is the address of the first byte (least significant byte) of the word to be
modified. The address is automatically aligned to a word boundary.


tempa <- src and not (3);              # force alignment to word boundary
temp <- atomic_read (tempa);
atomic_write (tempa) <- (src/dst and mask) or (temp and not(mask));
src/dst <- temp;


        atmod  g5, g7, g10       # g5 <- g5 masked by g7,
                                 # where g5 specifies the
                                 # address of a word in
                                 # memory;
                                 # g10 <- initial value
                                 # stored at address g5
                                 # in memory
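The masked update performed by atmod can be modeled in the same spirit. In this Python sketch the names are hypothetical, memory is a dictionary keyed by word-aligned address, and atomicity is not modeled; the focus is on how the mask merges the new value with the existing word.

```python
def atmod(memory, addr, mask, value):
    """Model of atmod: replace the masked bits of the word at addr.

    Returns the initial word, which the instruction places in src/dst.
    """
    aligned = addr & ~3                                   # word alignment
    temp = memory[aligned]
    memory[aligned] = (value & mask) | (temp & ~mask & 0xFFFFFFFF)
    return temp


memory = {0x200: 0xAAAA5555}
old = atmod(memory, 0x201, 0x0000FFFF, 0x12345678)        # change low halfword only
```

Only the bits selected by the mask come from the new value; all other bits of the word in memory are preserved.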




bal, balx


Mnemonics:
bal     Branch And Link
balx    Branch And Link Extended


Format:
bal     targ
        disp

balx    targ,   dst
        mem     reg

Description:
Stores the address of the next instruction (the instruction following the bal or balx instruction) and
branches to the instruction specified with the targ operand.


With the bal instruction, the address of the next instruction is stored in register g14. The targ
operand can be either a label or an absolute address that specifies the IP of the target instruction.
This value can be no farther than -2^23 to (2^23 - 4) from the current IP.


The balx instruction performs almost the same operation as the bal instruction except that the target
instruction can be farther than -2^23 to (2^23 - 4) from the current IP. With the balx instruction, the
address of the next instruction is stored in dst. The targ operand is a memory type, which allows the
full range of addressing modes to be used to specify the IP of the target instruction. Here, the
"IP + displacement" addressing mode allows the instruction to be IP-relative. Indirect branching can
be performed by placing the target address in a register and then using one of the register-indirect
addressing modes.


Refer to Chapter 5 for a complete discussion of the addressing modes available with memory-type
operands.


Note 


At the machine level, the bal instruction uses the CTRL instruction format. With this format, the
target instruction for the branch is specified by means of a word-displacement (represented by
displacement in the following action statement for the bal instruction), which can range from -2^21
to (2^21 - 1). To determine the IP of the target instruction, the processor converts this displacement
value to a byte displacement (i.e., multiplies the value by 4). It then adds the resulting byte
displacement to the current IP.


bal, balx 


To allow labels or absolute 
addresses 
to be used in the assembly-language 
version 
of the bal instruction, 
the Intel 80960KB 
Assembler 
performs 
the 
following 
calculation 
to convert 
the targ value 
in an assembly-language 
instruction 
to the displacement 
value required 
by the machine 
instruction 
format: 


displacement 
= (targ/4) - IP 


For further information 
about the CTRL instruction 
format, refer to Appen- 


dix B. 
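The relationship between targ and the CTRL-format word displacement can be checked with a short calculation. The sketch below, in Python, simply inverts the processor's rule (multiply the word displacement by 4 and add it to the current IP); the function names and the error handling are assumptions made for illustration and do not describe the Intel 80960KB Assembler's actual implementation.

```python
def ctrl_displacement(targ, ip):
    """Word displacement that takes a CTRL-format branch from ip to targ."""
    byte_disp = targ - ip
    if byte_disp % 4 != 0:
        raise ValueError("target is not a whole number of words from IP")
    disp = byte_disp // 4
    # The CTRL format holds a word displacement of -2^21 to (2^21 - 1).
    if not -(2 ** 21) <= disp <= 2 ** 21 - 1:
        raise ValueError("target is out of range for the CTRL format")
    return disp


def ctrl_target(ip, disp):
    """IP reached by the processor: byte displacement added to current IP."""
    return (ip + disp * 4) & 0xFFFFFFFF
```

A target that is too far away for this calculation is the case the extended (memory-type) forms of the instructions are meant to cover.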


Action:
bal:   g14 ← IP + 4;             # destination of next IP is always g14
       IP ← IP + displacement;   # resume execution at the new IP

balx:  dst ← IP + inst length;   # instruction length is 4 or 8 bytes
       IP ← targ;                # resume execution at the new IP


balx (g2), g4    # IP ← (g2);
                 # address of return instruction
                 # is stored in g4; example of
                 # indirect addressing.

bal     CTRL
balx    MEM


b, bx

Mnemonics:   b     Branch
             bx    Branch Extended

Format:      b     targ
                   disp

             bx    targ
                   mem


With the b instruction, the targ operand can be either a label or an absolute
address that specifies the IP of the target instruction. This value can be no
farther than -2^23 to (2^23 - 4) from the current IP.

The bx instruction performs the same operation as the b instruction except
that the target instruction can be farther than -2^23 to (2^23 - 4) from the
current IP. With the bx instruction, the targ operand is a memory type, which
allows the full range of addressing modes to be used to specify the IP of the
target instruction. Here, the "IP + displacement" addressing mode allows the
instruction to be IP-relative. Indirect branching can be performed by placing
the target address in a register and then using one of the register-indirect
addressing modes.

Refer to Chapter 5 for a complete discussion of the addressing modes
available with memory-type operands.


Note

At the machine level, the b instruction uses the CTRL instruction format.
With this format, the target instruction for the branch is specified by means
of a word-displacement (represented by displacement in the following action
statement for the b instruction), which can range from -2^21 to (2^21 - 1).
To determine the IP of the target instruction, the processor converts this
displacement value to a byte displacement (i.e., multiplies the value by 4).
It then adds the resulting byte displacement to the current IP.

To allow labels or absolute addresses to be used in the assembly-language
version of the b instruction, the Intel 80960KB Assembler performs the
following calculation to convert the targ value in an assembly-language
instruction to the displacement value required by the machine instruction
format:

displacement = (targ - IP)/4

For further information about the CTRL instruction format, refer to
Appendix B.


bx 1332 (ip)    # IP ← IP + 1332;
                # this example uses ip-relative
                # addressing.

b      CTRL
bx     MEM

bal, balx, BRANCH IF, COMPARE INTEGER AND BRANCH, COMPARE ORDINAL AND BRANCH


bbc, bbs

Mnemonics:   bbc    Check Bit and Branch If Clear
             bbs    Check Bit and Branch If Set

Format:      bb*    bitpos,    src,    targ
                    reg/lit    reg     disp


Description:
Checks the bit in src (designated by bitpos) and sets the condition code in
the arithmetic controls according to the value found. The processor then
performs a conditional branch based on the value of the condition code.

For the bbc instruction, if the selected bit in src is clear, the processor
sets the condition code to 010₂ and branches to the instruction specified
with the targ operand; otherwise, it sets the condition code to 000₂ and goes
to the next instruction.

For the bbs instruction, if the selected bit is set, the processor sets the
condition code to 010₂ and branches to targ; otherwise, it sets the condition
code to 000₂ and goes to the next instruction.

When using the Intel 80960KB Assembler, the targ operand can be either a
label or an absolute address that is no farther than -2^12 to (2^12 - 4) from
the current IP.


Note

At the machine level, the bbc and bbs instructions use the COBR instruction
format. With this format, the target instruction for the branch is specified
by means of a word-displacement (represented by displacement in the following
action statement), which can range from -2^10 to (2^10 - 1). To determine the
IP of the target instruction, the processor converts this displacement value
to a byte displacement (i.e., multiplies the value by 4). It then adds the
resulting byte displacement to the IP of the next instruction.

To allow labels or absolute addresses to be used in the assembly-language
versions of the bbc and bbs instructions, the Intel 80960KB Assembler
performs the following calculation to convert the targ value in an
assembly-language instruction to the displacement value required by the
machine instruction format:

displacement = (targ - (IP + 4))/4

For further information about the COBR instruction format, refer to
Appendix B.


Action:
bbc:   if (src and 2^(bitpos mod 32)) = 0
          then AC.cc ← 2#010#;
             IP ← IP + 4 + (displacement * 4);
             # resume execution at the new IP
          else AC.cc ← 2#000#;
             IP ← IP + 4;   # resume execution at the next IP
       end if;

bbs:   if (src and 2^(bitpos mod 32)) ≠ 0
          then AC.cc ← 2#010#;
             IP ← IP + 4 + (displacement * 4);
             # resume execution at the new IP
          else AC.cc ← 2#000#;
             IP ← IP + 4;   # resume execution at the next IP
       end if;
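The two action statements differ only in the bit value that causes the branch, so both can be modeled with one routine. The Python sketch below is illustrative only; the function name, the `branch_if_set` flag, and the tuple return value are assumptions and not part of the instruction definition.

```python
def bit_branch(src, bitpos, ip, displacement, branch_if_set):
    """Model of bbc (branch_if_set=False) and bbs (branch_if_set=True).

    Returns (condition_code, next_ip).
    """
    bit_is_set = ((src >> (bitpos % 32)) & 1) == 1
    if bit_is_set == branch_if_set:
        # Branch taken: the byte displacement is added to the next IP.
        return 0b010, (ip + 4 + displacement * 4) & 0xFFFFFFFF
    # Branch not taken: fall through to the next instruction.
    return 0b000, (ip + 4) & 0xFFFFFFFF
```

Note that bitpos is reduced modulo 32 before the test, so a bitpos of 42 selects bit 10.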


# assume bit 10 of r6 is clear
bbc 10, r6, xyz    # bit 10 of r6 is checked
                   # and found clear;
                   # AC.cc ← 2#010#
                   # IP ← xyz;

bbc    COBR
bbs    COBR


Mnemonics:   be     Branch If Equal
             bne    Branch If Not Equal
             bl     Branch If Less
             ble    Branch If Less Or Equal
             bg     Branch If Greater
             bge    Branch If Greater Or Equal
             bo     Branch If Ordered
             bno    Branch If Unordered

Format:      b*     targ
                    disp


Description:
Branches to a new instruction according to the state of the condition code in
the arithmetic controls.

For all branch-if instructions except the bno instruction, the processor
branches to the instruction specified with the targ operand, if the logical
AND of the condition code and the mask-part of the opcode is not zero.
Otherwise, it goes to the next instruction.

For the bno instruction, the processor branches to the instruction specified
with targ, if the logical AND of the condition code and the mask-part of the
opcode is zero. Otherwise, it goes to the next instruction.

When using the Intel 80960KB Assembler, the targ operand can be either a
label or an absolute address that specifies the IP of the target instruction.
This value can be no farther than -2^23 to (2^23 - 4) from the current IP.


At the machine level, the branch-if instructions use the CTRL instruction
format. With this format, the target instruction for the branch is specified
by means of a word-displacement (represented by displacement in the following
action statements), which can range from -2^21 to (2^21 - 1). To determine
the IP of the target instruction, the processor converts this displacement
value to a byte displacement (i.e., multiplies the value by 4). It then adds
the resulting byte displacement to the current IP.

To allow labels or absolute addresses to be used in the assembly-language
version of the branch-if instructions, the Intel 80960KB Assembler performs
the following calculation to convert the targ value in an assembly-language
instruction to the displacement value required by the machine instruction
format:

displacement = (targ - IP)/4

For further information about the CTRL instruction format, refer to
Appendix B.


Instruction    Mask    Condition

bno            000     Unordered
bg             001     Greater
be             010     Equal
bge            011     Greater or equal
bl             100     Less
bne            101     Not equal
ble            110     Less or equal
bo             111     Ordered

For the bno instruction (unordered), the branch is taken if the condition
code is equal to 000₂.


Action:
For all instructions except bno:

if (mask and AC.cc) ≠ 2#000#
   then IP ← IP + displacement;   # resume execution at new IP
end if;

For the bno instruction:

if AC.cc = 2#000#
   then IP ← IP + displacement;   # resume execution at new IP
end if;
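The mask test can be tried out against the mask table given for these instructions. The following Python sketch is illustrative only; the dictionary and function names are assumptions, while the mask values are those listed in the table.

```python
# Mask part of the opcode for each branch-if instruction (from the table).
BRANCH_MASKS = {
    "bno": 0b000, "bg": 0b001, "be": 0b010, "bge": 0b011,
    "bl": 0b100, "bne": 0b101, "ble": 0b110, "bo": 0b111,
}


def branch_taken(mnemonic, cc):
    """True if the branch-if instruction branches for condition code cc."""
    if mnemonic == "bno":
        # bno is the special case: it branches only when the code is 000.
        return cc == 0b000
    return (BRANCH_MASKS[mnemonic] & cc) != 0
```

For example, after a comparison that leaves the condition code at 100₂ (less than), bl, bne, ble and bo branch, while bg, be, bge and bno do not.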


# assume (AC.cc AND 2#100#) ≠ 0
bl xyz    # IP ← xyz;

be     12    CTRL
bne    15    CTRL
bl     14    CTRL
ble    16    CTRL
bg     11    CTRL
bge    13    CTRL
bo     17    CTRL
bno    10    CTRL

b, bx


Description:
Calls a new procedure. The processor performs a local call operation as
described in Chapter 4 in the section titled "Local Calls." As part of this
operation, the processor allocates a new set of local registers and a new
stack frame for the called procedure. The processor then goes to the
instruction specified with the targ argument and begins execution of the new
procedure.

When using the Intel 80960KB Assembler, the targ operand can be either a
label or an absolute address that specifies the IP of the first instruction
in the called procedure. This value can be no farther than -2^23 to
(2^23 - 4) from the current IP.


At the machine level, the call instruction uses the CTRL instruction format.
With this format, the first instruction of the called procedure is specified
by means of a word-displacement (represented by displacement in the following
action statement), which can range from -2^21 to (2^21 - 1). To determine the
IP of the target instruction, the processor converts this displacement value
to a byte displacement (i.e., multiplies the value by 4). It then adds the
resulting byte displacement to the current IP.

To allow labels or absolute addresses to be used in the assembly-language
version of the call instruction, the Intel 80960KB Assembler performs the
following calculation to convert the targ value in an assembly-language
instruction to the displacement value required by the machine instruction
format:

displacement = (targ - IP)/4

For further information about the CTRL instruction format, refer to
Appendix B.


Action:
wait for any uncompleted instructions to finish;
temp ← (SP + 63) and not (63);   # round to next boundary
RIP ← IP;
if register_set_available
   then allocate as new frame;
   else save a register_set in memory at its FP;
      allocate as new frame;
# local register references now refer to new frame
endif;
IP ← IP + displacement;
PFP ← FP;
FP ← temp;
SP ← temp + 64;
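The pointer arithmetic in this action statement rounds the stack pointer up to the next 64-byte boundary and reserves 64 bytes at the start of the new frame. The Python sketch below reproduces only that arithmetic; the function name and the dictionary it returns are assumptions made for illustration, and register-set allocation is not modeled.

```python
def new_frame(sp, fp):
    """Frame pointers after a local call, given the caller's SP and FP."""
    temp = (sp + 63) & ~63 & 0xFFFFFFFF   # round SP up to a 64-byte boundary
    return {
        "PFP": fp,          # previous frame pointer is the caller's FP
        "FP": temp,         # new frame starts on the boundary
        "SP": temp + 64,    # first 64 bytes of the frame are reserved
    }
```

If the caller's SP already lies on a 64-byte boundary, the new frame starts exactly at that address.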


Format:      calls    targ
                      reg/lit


Description:
Calls a system procedure. The targ operand gives the number of the procedure
being called.

For this instruction, the processor performs the system call operation
described in Chapter 4 in the section titled "System Calls." The targ operand
provides an index to an entry in the system procedure table. From this entry,
the processor gets the IP of the called procedure.

The procedure called can be either a local procedure or a supervisor
procedure, depending on the entry type in the procedure table. If it is a
supervisor procedure, the processor also switches to supervisor mode (if it
is not already in this mode).

As part of this operation, the processor allocates a new set of local
registers and a new stack frame for the called procedure. If the processor
switches to the supervisor mode, the new stack frame is created on the
supervisor stack.


Action:
if targ > 259 then raise Protection Length Fault;
wait for any uncompleted instructions to finish;
temp_PE ← memory (SPT, 48 + (4 * targ));
# SPT is pointer to system procedure table from IMI
RIP ← IP;
IP ← temp_PE.address;
if (temp_PE.type = local) or execution_mode = supervisor
   then temp ← (SP + 63) and not (63);
      tempRRR ← 2#000#;
   else temp ← memory (SPTSS, 12);   # supervisor call
      tempRRR ← 2#01T#;              # T is process_controls.T
      execution_mode ← supervisor;
      process_controls.T ← temp.T;
endif;
if frame_available
   then allocate as new frame;
   else save a frame in memory at its FP;
      allocate as new frame;
# local register references now refer to new frame
endif;
PFP ← FP;
LO.RRR ← tempRRR;
FP ← temp;
SP ← temp + 64;
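The first two steps of this action statement amount to a bounds check on the procedure number followed by a table-offset calculation. The Python sketch below shows only that calculation; the function name is hypothetical, and the use of an exception to stand in for the protection length fault is an assumption made for illustration.

```python
def system_procedure_entry_offset(targ):
    """Byte offset of procedure entry number targ within the table."""
    if targ > 259:
        # Stands in for the protection length fault raised by the processor.
        raise ValueError("procedure number exceeds 259")
    return 48 + 4 * targ    # entries are one word each, starting at byte 48
```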


calls r12    # IP ← value obtained from
             # procedure table for procedure
             # number given in r12


Format:      callx    targ
                      mem


Description:
Calls a new procedure. The processor performs a local call operation as
described in Chapter 4 in the section titled "Local Calls." As part of this
operation, the processor allocates a new set of local registers and a new
stack frame for the called procedure. The processor then goes to the
instruction specified with the targ argument and begins execution of the new
procedure.

This instruction performs the same operation as the call instruction except
that the target instruction can be farther than -2^23 to (2^23 - 4) from the
current IP.

The targ operand is a memory type, which allows the full range of addressing
modes to be used to specify the IP of the target instruction. The "IP +
displacement" addressing mode allows the instruction to be IP-relative.
Indirect calls can be performed by placing the target address in a register
and then using one of the register-indirect addressing modes.

Refer to Chapter 5 for a complete discussion of the addressing modes
available with memory-type operands.


Action:
wait for any uncompleted instructions to finish;
temp ← (SP + 63) and not (63);   # round to next boundary
RIP ← IP;
if register_set_available
   then allocate as new frame;
   else save a register_set in memory at its FP;
      allocate as new frame;
# local register references now refer to new frame
endif;
IP ← targ;
PFP ← FP;
FP ← temp;
SP ← temp + 64;


callx (g5)    # IP ← (g5), where the address
              # in g5 is the address of the new
              # procedure


Format:      chkbit    bitpos,    src
                       reg/lit    reg/lit


Description:
Checks the bit in src designated by bitpos and sets the condition code
according to the value found. If the bit is set, the condition code is set to
010₂; if the bit is clear, the condition code is set to 000₂.

Action:
if (src and 2^(bitpos mod 32)) = 0
   then AC.cc ← 2#000#;
   else AC.cc ← 2#010#;
end if;


classr, classrl

Mnemonics:   classr     Classify Real
             classrl    Classify Long Real

Format:      classr*    src
                        freg/flit


Description:
Checks the classification of the real number in src and stores the class in
arithmetic-status bits (3 through 6) of the arithmetic controls.

For the classrl instruction, if the src operand references a global or local
register, this register is the first (lowest numbered) of two successive
registers. Also, this register must be even numbered (e.g., g0, g2, g4).

The following table shows the setting of the arithmetic-status bits depending
on the classification of the operand.


AStatus    Classification

s000       Zero
s001       Denormalized number
s010       Normal finite number
s011       Infinity
s100       Quiet NaN
s101       Signaling NaN
s110       Reserved operand

Refer to Chapter 7 for a discussion of the different real number
classifications.


Action:
s ← sign_of(src)
if src = 0
   then arithmetic_status ← s000;
elseif src = denormalized
   then arithmetic_status ← s001;
elseif src = normal finite
   then arithmetic_status ← s010;
elseif src = ∞
   then arithmetic_status ← s011;
elseif src = QNaN
   then arithmetic_status ← s100;
elseif src = SNaN
   then arithmetic_status ← s101;
elseif src = reserved operand
   then arithmetic_status ← s110;
end if
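The classification can be illustrated for a 32-bit real value by examining its bit fields. The Python sketch below is not taken from the manual: it assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits) and assumes that the most significant fraction bit distinguishes a quiet NaN from a signaling NaN. The function name is hypothetical, and the reserved-operand class is not modeled.

```python
def classify_real32(bits):
    """Return the 4-bit arithmetic-status value (s followed by the class)."""
    sign = (bits >> 31) & 1
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        code = 0b000 if fraction == 0 else 0b001   # zero / denormalized
    elif exponent == 0xFF:
        if fraction == 0:
            code = 0b011                           # infinity
        elif fraction & 0x400000:
            code = 0b100                           # quiet NaN (assumed)
        else:
            code = 0b101                           # signaling NaN (assumed)
    else:
        code = 0b010                               # normal finite number
    return (sign << 3) | code
```

For example, the bit pattern 0xFF800000 (negative infinity) yields 1011₂: the sign bit followed by the class 011.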


Refer to the discussion of faults at the beginning of this chapter.

None of the floating-point exceptions can be raised.

classr     68F    REG
classrl    69F    REG


Format:      clrbit    bitpos,    src,       dst
                       reg/lit    reg/lit    reg


Description:
Copies the src value to dst with one bit cleared. The bitpos operand
specifies the bit to be cleared.


clrbit 23, g3, g6    # g6 ← g3 with bit 23
                     # cleared
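The effect of the example can be reproduced with a one-line mask operation. This Python sketch is illustrative only; the function name is hypothetical, and reducing bitpos modulo 32 is an assumption carried over from the action statements of the other bit instructions in this chapter.

```python
def clrbit(bitpos, src):
    """Return src with the bit selected by bitpos cleared (32-bit result)."""
    return src & ~(1 << (bitpos % 32)) & 0xFFFFFFFF
```

With src equal to 0xFFFFFFFF and bitpos equal to 23, the result is 0xFF7FFFFF.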


cmpi, cmpo

Mnemonics:   cmpi    Compare Integer
             cmpo    Compare Ordinal

Format:      cmp*    src1,      src2
                     reg/lit    reg/lit


Description:
Compares the src2 and src1 values and sets the condition code according to
the results of the comparison. The following table shows the setting of the
condition code for the three possible results of the comparison.

Condition Code    Comparison

100               src1 < src2
010               src1 = src2
001               src1 > src2

The cmpi instruction followed by one of the branch-if instructions is
equivalent to one of the compare-integer-and-branch instructions. The latter
method of comparing and branching produces more compact code; however, the
former method can result in faster running code because it takes advantage of
the processor's pipelined architecture. The same is true for the cmpo
instruction and the compare-ordinal-and-branch instructions.


Action:
if src1 < src2 then AC.cc ← 2#100#;
elseif src1 = src2 then AC.cc ← 2#010#;
else AC.cc ← 2#001#;
end if;
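The same 32-bit pattern can compare differently under cmpi and cmpo, because integers are signed and ordinals are unsigned. The Python sketch below illustrates this; the function names and the `signed` flag are assumptions made for illustration.

```python
def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def compare(src1, src2, signed):
    """Condition code set by cmpi (signed=True) or cmpo (signed=False)."""
    if signed:
        a, b = to_signed32(src1), to_signed32(src2)
    else:
        a, b = src1 & 0xFFFFFFFF, src2 & 0xFFFFFFFF
    if a < b:
        return 0b100
    if a == b:
        return 0b010
    return 0b001
```

For instance, 0xFFFFFFFF compared with 1 gives 100₂ as an integer comparison (-1 is less than 1) but 001₂ as an ordinal comparison.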


cmpo 0x10, r9    # compare values in r9 and 0x10
                 # and set condition code

cmpi    5A1    REG
cmpo    5A0    REG


cmpdeci, cmpdeco 


Mnemonics:   cmpdeci    Compare and Decrement Integer
             cmpdeco    Compare and Decrement Ordinal

Format:      cmpdec*    src1,      src2,      dst
                        reg/lit    reg/lit    reg


Description:
Compares the src2 and src1 values and sets the condition code according to
the results of the comparison. The src2 operand is then decremented by one
and the result is stored in dst.

The following table shows the setting of the condition code for the three
possible results of the comparison.

Condition Code    Comparison

100               src1 < src2
010               src1 = src2
001               src1 > src2

These instructions are intended for use in ending iterative loops. For the
cmpdeci instruction, integer overflow is ignored to allow looping down
through the minimum integer values.


Action:
if src1 < src2 then AC.cc ← 2#100#;
elseif src1 = src2 then AC.cc ← 2#010#;
elseif src1 > src2 then AC.cc ← 2#001#;
end if;
dst ← src2 - 1;   # overflow suppressed for cmpdeci instruction
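The statement that overflow is suppressed for cmpdeci can be illustrated by letting the decrement wrap around at the minimum integer value. The Python sketch below is an interpretation, not a definition: the function names are hypothetical, and treating "overflow suppressed" as 32-bit two's-complement wraparound is an assumption.

```python
def wrap_signed32(value):
    """Wrap value into the 32-bit two's-complement integer range."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def cmpdeci(src1, src2):
    """Return (condition_code, dst) for signed operands src1 and src2."""
    if src1 < src2:
        cc = 0b100
    elif src1 == src2:
        cc = 0b010
    else:
        cc = 0b001
    return cc, wrap_signed32(src2 - 1)   # decrement wraps instead of faulting
```

Comparing 12 with 20 gives condition code 100₂ and a dst value of 19; decrementing the minimum integer wraps to the maximum integer rather than signaling overflow.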


# g7 and 12 are compared;
# g1 ← g7 - 1

cmpdeci    5A7    REG
cmpdeco    5A6    REG


cmpinci, cmpinco 


Mnemonics:   cmpinci    Compare and Increment Integer
             cmpinco    Compare and Increment Ordinal

Format:      cmpinc*    src1,      src2,      dst
                        reg/lit    reg/lit    reg


Description:
Compares the src2 and src1 values and sets the condition code according to
the results of the comparison. The src2 operand is then incremented by one
and the result is stored in dst.

The following table shows the setting of the condition code for the three
possible results of the comparison.

Condition Code    Comparison

100               src1 < src2
010               src1 = src2
001               src1 > src2

These instructions are intended for use in ending iterative loops. For the
cmpinci instruction, integer overflow is ignored to allow looping up through
the maximum integer values.


Action:
if src1 < src2 then AC.cc ← 2#100#;
elseif src1 = src2 then AC.cc ← 2#010#;
elseif src1 > src2 then AC.cc ← 2#001#;
end if;
dst ← src2 + 1;   # overflow suppressed for cmpinci instruction


# g2 and r8 are compared;
# g9 ← g2 + 1

cmpinci    5A5    REG
cmpinco    5A4    REG


cmpor, cmporl

Mnemonics:   cmpor     Compare Ordered Real
             cmporl    Compare Ordered Long Real

Format:      cmpor*    src1,        src2
                       freg/flit    freg/flit


Description:
Compares the src2 and src1 values and sets the condition code according to
the results of the comparison.

For the cmporl instruction, if the src1 or src2 operand references a global
or local register, this register is the first (lowest numbered) of two
successive registers. Also, this register must be even numbered (e.g., g0,
g2, g4).

The following table shows the setting of the condition code for the four
possible results of the comparison.

Condition Code    Comparison

100               src1 < src2
010               src1 = src2
001               src1 > src2
000               if either src1 or src2 is a NaN

The algorithm for these instructions checks the classification of the
operands. If either is in the NaN class, the condition code is set to 000₂
and a floating invalid-operation exception is raised. The cmpor and cmporl
instructions operate the same as the cmpr and cmprl instructions, except that
the latter instructions do not signal an exception if a NaN value is
detected.

If a floating-reserved-encoding fault occurs, the condition code results are
undefined.


Action:
if src1 < src2
   then AC.cc ← 2#100#;
elseif src1 = src2
   then AC.cc ← 2#010#;
elseif src1 > src2
   then AC.cc ← 2#001#;
else AC.cc ← 2#000#;   # indicates one number is a NaN
   raise floating invalid operation fault
end if;
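The difference between the ordered compare (cmpor, cmporl) and the plain real compare (cmpr, cmprl) lies only in what happens when an operand is a NaN. The Python sketch below models that difference; the function name, the `ordered` flag, and the boolean used to report the invalid-operation exception are assumptions made for illustration, and the exception mask bit is not modeled.

```python
import math


def compare_real(src1, src2, ordered):
    """Return (condition_code, invalid_operation_signaled)."""
    if math.isnan(src1) or math.isnan(src2):
        # Unordered result: code 000; only the ordered forms signal.
        return 0b000, ordered
    if src1 < src2:
        return 0b100, False
    if src1 == src2:
        return 0b010, False
    return 0b001, False
```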


Refer to the discussion of faults at the beginning of this chapter.

One or more operands is an unnormalized (including denormalized) value and
the normalizing-mode bit in the arithmetic controls is set.

The following floating-point exception can be raised. Whether or not the
exception results in a fault being raised depends on the state of its
associated mask bit in the arithmetic controls.

# compare value in g12,g13
# with value in g6,g7

cmpor     684    REG
cmporl    694    REG


Mnemonics:   cmpr     Compare Real
             cmprl    Compare Long Real

Format:      cmpr*    src1,        src2
                      freg/flit    freg/flit


Description:
Compares the src2 and src1 values and sets the condition code according to
the results of the comparison. For the cmprl instruction, if the src1 or src2
operand references a global or local register, this register is the first
(lowest numbered) of two successive registers.

The following table shows the setting of the condition code for the four
possible results of the comparison.

Condition Code    Comparison

100               src1 < src2
010               src1 = src2
001               src1 > src2
000               if either src1 or src2 is a NaN

The algorithm for these instructions checks the classification of the
operands. If either is in the NaN class, the condition code is set to 000₂,
but no fault is raised. The cmpr and cmprl instructions operate the same as
the cmpor and cmporl instructions, except that the latter instructions raise
an invalid-operation exception if a NaN value is detected.

If a floating-reserved-encoding fault occurs, the condition code results are
undefined.


Action:
if src1 < src2
   then AC.cc ← 2#100#;
elseif src1 = src2
   then AC.cc ← 2#010#;
elseif src1 > src2
   then AC.cc ← 2#001#;
else AC.cc ← 2#000#;   # indicates one number is a NaN
end if;


Refer to the discussion of faults at the beginning of this chapter.

One or more operands is an unnormalized (including denormalized) value and
the normalizing-mode bit in the arithmetic controls is set.

The following floating-point exception can be raised. Whether or not the
exception results in a fault being raised depends on the state of its
associated mask bit in the arithmetic controls.

One or more operands are an SNaN value.

cmprl g2, g6    # compare values in g6,g7
                # and g2,g3

cmpr     685    REG
cmprl    695    REG


Mnemonics:   cmpibe     Compare Integer And Branch If Equal
             cmpibne    Compare Integer And Branch If Not Equal
             cmpibl     Compare Integer And Branch If Less
             cmpible    Compare Integer And Branch If Less Or Equal
             cmpibg     Compare Integer And Branch If Greater
             cmpibge    Compare Integer And Branch If Greater Or Equal
             cmpibo     Compare Integer And Branch If Ordered
             cmpibno    Compare Integer And Branch If Unordered

             cmpobe     Compare Ordinal And Branch If Equal
             cmpobne    Compare Ordinal And Branch If Not Equal
             cmpobl     Compare Ordinal And Branch If Less
             cmpoble    Compare Ordinal And Branch If Less Or Equal
             cmpobg     Compare Ordinal And Branch If Greater
             cmpobge    Compare Ordinal And Branch If Greater Or Equal

Format:      cmpib*     src1,      src2,    targ
                        reg/lit    reg      disp

             cmpob*     src1,      src2,    targ
                        reg/lit    reg      disp


Description:
Compares the src2 and src1 values and sets the condition code according to
the results of the comparison. If the logical AND of the condition code and
the mask-part of the opcode is not zero, the processor branches to the
instruction specified with the targ operand; otherwise, the processor goes to
the next instruction.

When using the Intel 80960KB Assembler, the targ operand can be either a
label or an absolute address that is no farther than -2^12 to (2^12 - 4) from
the current IP.


Note

At the machine level, the compare-and-branch instructions use the COBR
instruction format. With this format, the target instruction for the branch
is specified by means of a word-displacement (represented by displacement in
the following action statement), which can range from -2^10 to (2^10 - 1). To
determine the IP of the target instruction, the processor converts this
displacement value to a byte displacement (i.e., multiplies the value by 4).
It then adds the resulting byte displacement to the IP of the next
instruction.

To allow labels or absolute addresses to be used in the assembly-language
versions of these instructions, the Intel 80960KB Assembler performs the
following calculation to convert the targ value in an assembly-language
instruction to the displacement value required by the machine instruction
format:

displacement = (targ - (IP + 4))/4

For further information about the COBR instruction format, refer to
Appendix B.


Instruction    Mask    Branch Condition

cmpibno        000     No Condition
cmpibg         001     src1 > src2
cmpibe         010     src1 = src2
cmpibge        011     src1 ≥ src2
cmpibl         100     src1 < src2
cmpibne        101     src1 ≠ src2
cmpible        110     src1 ≤ src2
cmpibo         111     Any Condition

cmpobg         001     src1 > src2
cmpobe         010     src1 = src2
cmpobge        011     src1 ≥ src2
cmpobl         100     src1 < src2
cmpobne        101     src1 ≠ src2
cmpoble        110     src1 ≤ src2

The cmpibo instruction always branches; the cmpibno instruction never
branches.

The functions that these instructions perform can be duplicated with a cmpi
instruction followed by a branch-if instruction, as described in the
description of the cmpi instruction in this chapter.


Action:
if src1 < src2 then AC.cc ← 2#100#;
elseif src1 = src2 then AC.cc ← 2#010#;
else AC.cc ← 2#001#;
end if;
if (mask and AC.cc) ≠ 2#000#
   then IP ← IP + 4 + (displacement * 4);
      # resume execution at the new IP
   else IP ← IP + 4;
      # resume execution at the next IP
end if;
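The compare step and the mask test can be combined in a short model. The Python sketch below is illustrative only: the condition suffixes used as dictionary keys, the function name, and the tuple return value are assumptions, and the operands are taken to be already interpreted as integers or ordinals by the caller. The mask values are those listed in the table for these instructions.

```python
# Mask part of the opcode, keyed by the condition suffix of the mnemonic.
COBR_MASKS = {
    "no": 0b000, "g": 0b001, "e": 0b010, "ge": 0b011,
    "l": 0b100, "ne": 0b101, "le": 0b110, "o": 0b111,
}


def compare_and_branch(cond, src1, src2, ip, displacement):
    """Return (condition_code, next_ip) for a compare-and-branch."""
    if src1 < src2:
        cc = 0b100
    elif src1 == src2:
        cc = 0b010
    else:
        cc = 0b001
    if COBR_MASKS[cond] & cc:
        # Branch taken: byte displacement is added to the next IP.
        return cc, ip + 4 + displacement * 4
    return cc, ip + 4
```

Because a comparison always sets exactly one condition-code bit, the mask 111 always branches and the mask 000 never does, which matches the behavior stated for cmpibo and cmpibno.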


# assume g3 < g9
cmpibl g3, g9, xyz     # g9 is compared with g3;
                       # IP ← xyz.

# assume r7 ≥ 19
cmpobge r7, 19, xyz    # 19 is compared with r7
                       # IP ← xyz.


cmpibe     3A    COBR
cmpibne    3D    COBR
cmpibl     3C    COBR
cmpible    3E    COBR
cmpibg     39    COBR
cmpibge    3B    COBR
cmpibo     3F    COBR
cmpibno    38    COBR

cmpobe     32    COBR
cmpobne    35    COBR
cmpobl     34    COBR
cmpoble    36    COBR
cmpobg     31    COBR
cmpobge    33    COBR

BRANCH IF, cmpi

concmpi, concmpo

Mnemonics:   concmpi    Conditional Compare Integer
             concmpo    Conditional Compare Ordinal

Format:      concmp*    src1,      src2
                        reg/lit    reg/lit


Description:
Compares the src2 and src1 values if bit 2 of the condition code is not set.
If the comparison is performed, the condition code is set according to the
results of the comparison.

These instructions are provided to facilitate bounds checking by means of
two-sided range comparisons (e.g., is A between B and C?). They are generally
used after a compare instruction to test whether a value is inclusively
between two other values.

The example below illustrates this application by testing whether the value
in g3 is between the values in g5 and g6, where g5 is assumed to be less than
g6. First a comparison (cmpo) of g3 and g6 is performed. If g3 is less than
or equal to g6 (i.e., condition code is either 010₂ or 001₂), a conditional
comparison (concmpo) of g3 and g5 is then performed. If g3 is greater than or
equal to g5 (indicating that g3 is within the bounds of g5 and g6), the
condition code is set to 010₂; otherwise, it is set to 001₂.


Action:
if (AC.cc and 2#100#) = 0 then
   if src1 ≤ src2
      then AC.cc ← 2#010#;
      else AC.cc ← 2#001#;
   endif;
endif;
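The two-sided range test described for these instructions can be traced with a small model of cmpo followed by concmpo. The Python sketch below is illustrative only; the function names are hypothetical and the operands are assumed to be non-negative ordinals.

```python
def cmpo(src1, src2):
    """Condition code set by an ordinal compare."""
    if src1 < src2:
        return 0b100
    if src1 == src2:
        return 0b010
    return 0b001


def concmpo(cc, src1, src2):
    """Conditional compare: acts only if bit 2 of the code is clear."""
    if cc & 0b100 == 0:
        return 0b010 if src1 <= src2 else 0b001
    return cc


def within_bounds(value, low, high):
    """True if low <= value <= high, using the cmpo/concmpo sequence."""
    cc = cmpo(high, value)          # 100 here means value is above high
    cc = concmpo(cc, low, value)    # skipped when value is above high
    return cc == 0b010
```

A final condition code of 010₂ therefore means the value is inside the bounds, 001₂ means it is below the lower bound, and 100₂ means it is above the upper bound.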


cmpo g6, g3       # compares g6 and g3 and sets
                  # condition code
concmpo g5, g3    # if condition code is not
                  # 2#1xx#, g5 is compared
                  # with g3

concmpi    5A3    REG
concmpo    5A2    REG


cosr, cosrl

Mnemonics:   cosr     Cosine Real
             cosrl    Cosine Long Real

Format:      cos*     src,         dst
                      freg/flit    freg


Description:
Calculates the cosine of the value in src and stores the result in dst. The
src value is an angle given in radians. The resulting dst value is in the
range -1 to +1, inclusive.

For the cosrl instruction, if the src or dst operand references a global or
local register, this register is the first (lowest numbered) of two
successive registers. Also, this register must be even numbered (e.g., g0,
g2, g4).

The following table shows the results obtained when taking the cosine of
various classes of numbers with neither overflow nor underflow.


Src     Dst

-∞      *
-F      -1 to +1
-0      +1
+0      +1
+F      -1 to +1
+∞      *
NaN     NaN

Notes:
F    Means finite-real number
*    Indicates floating invalid-operation exception


In the trigonometric instructions, the 80960KB uses a value for π with a
66-bit mantissa, which is 2 bits more than are available in the extended-real
format. The section in Chapter 12 titled "Pi" gives this π value, along with
some suggestions for representing this value in a program.


Refer to the discussion of faults at the beginning of this chapter.

One or more operands is an unnormalized (including denormalized) value and
the normalizing-mode bit in the arithmetic controls is set.

The following floating-point exceptions can be raised. Whether or not an
exception results in a fault being raised depends on the state of its
associated mask bit in the arithmetic controls.

The src operand is ∞.

One or more operands are an SNaN value.

Result cannot be represented exactly in destination format.

# cosine of value in r8,r9 is
# stored in g2,g3

cosr     68D    REG
cosrl    69D    REG


cpyrsre, cpysre

Mnemonics:   cpysre     Copy Sign Real Extended
             cpyrsre    Copy Reversed Sign Real Extended

Format:      cpy*       src1,        src2,        dst
                        freg/flit    freg/flit    freg


Description:
Copies the absolute value of src1 into dst. For the cpysre instruction, the
sign of src2 is copied to dst; for the cpyrsre instruction, the opposite of
the sign of src2 is copied to dst.

If the src1, src2, or dst operand references a global or local register, this
register is the first (lowest numbered) of three successive registers. Also,
the number of this register must be a multiple of four (e.g., g0, g4, g8).

These instructions only operate on values in the extended-real format. The
same operations can be performed on real and long-real values using the
setbit and clrbit instructions, or a combination of the chkbit and alterbit
instructions.


Action:
cpysre:    if src2 is positive
              then dst ← abs (src1)
              else dst ← -abs (src1)

cpyrsre:   if src2 is negative
              then dst ← abs (src1)
              else dst ← -abs (src1)
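The two action statements correspond closely to a copy-sign operation. The Python sketch below is illustrative only: the function names mirror the mnemonics but are otherwise hypothetical, and Python floats are 64-bit values, not the extended-real format on which these instructions operate.

```python
import math


def cpysre(src1, src2):
    """Magnitude of src1 with the sign of src2."""
    return math.copysign(abs(src1), src2)


def cpyrsre(src1, src2):
    """Magnitude of src1 with the opposite of the sign of src2."""
    return math.copysign(abs(src1), -src2)
```

Because math.copysign looks at the sign bit rather than the numeric value, a src2 of -0.0 counts as negative in this model.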


Refer to the discussion of faults at the beginning of this chapter.

One or more operands is a denormalized value and the normalizing-mode bit in
the arithmetic controls is set.

cpysre fp0, fp1, fp2    # absolute value from fp0 is copied to
                        # fp2; sign from fp1 is copied to fp2

cpysre     6E2    REG
cpyrsre    6E3    REG


cvtilr, cvtir


Mnemonics:
cvtilr    Convert Long Integer to Real
cvtir     Convert Integer to Real


src,    reg/lit
dst     freg


Description:
Converts the integer in src to a real and stores the result in dst. For the cvtilr
instruction, the src operand references the first (lowest numbered) of two
successive registers. Also, this register must be even numbered (e.g., g0, g2,
g4).


Converting an integer to long real format requires two instructions. First, the
integer is converted to extended real format by using the cvtir or cvtilr
instruction with a floating-point register as a destination. Then the movrl
instruction is used to move the value from the floating-point register to two
global or local registers, causing an explicit conversion to long real format.
(Note that this conversion is always exact.) The example section below
illustrates this conversion.


Refer to the discussion of faults at the beginning of this chapter.


The following floating-point exception can be raised. Whether or not the
exception results in a fault being raised depends on the state of its associated
mask bit in the arithmetic controls.


Can only be signaled when converting an integer to real (32-bit) format.
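Why only the 32-bit real destination can lose precision is easy to see numerically: a 32-bit real carries a 24-bit significand, so some 32-bit integers have no exact representation. The sketch below uses Python's struct module to round to a 32-bit real; the helper names are hypothetical and round-to-nearest is assumed.

```python
import struct


def to_real32(n: int) -> float:
    """Round an integer to the nearest 32-bit real (returned as a Python float)."""
    return struct.unpack("<f", struct.pack("<f", float(n)))[0]


def conversion_is_inexact(n: int) -> bool:
    """True when the 32-bit real result differs from the original integer."""
    return to_real32(n) != n
```

For example, 16777216 (2^24) converts exactly, while 16777217 does not.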


# Conversion of an integer to a long real value

cvtir   g6, fp3
movrl   fp3, g8         # result stored in g8,g9


cvtir     674    REG
cvtilr    675    REG


cvtri, cvtril, cvtzri, cvtzril


Mnemonics:
cvtri      Convert Real To Integer
cvtril     Convert Real To Integer Long
cvtzri     Convert Truncated Real To Integer
cvtzril    Convert Truncated Real To Long Integer


src,    freg/flit


For the cvtril and cvtzril instructions, the dst operand references the first
(lowest numbered) of two successive registers. Also, this register must be
even numbered (e.g., g0, g2, g4).


The nontruncated versions of these instructions round according to the
current rounding mode in the Arithmetic Controls register. The truncated
versions always round toward zero.


Converting a long real value to an integer requires two instructions. First,
the long real value is converted to extended real format by using the movrl
instruction with a floating-point register as a destination. (Note that this
operation is always exact.) Then one of the convert real-to-integer
instructions is used to move the value from the floating-point register to one
or two global or local registers. The example section below illustrates this
conversion.


If the magnitude of the result cannot be represented in the destination, an
integer-overflow fault is raised, and the maximum positive or maximum
negative value is stored in the destination (depending on whether the real
value was positive or negative, respectively).
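The difference between the rounding and truncating forms, and the saturation on overflow, can be sketched as follows. This model assumes the rounding mode is round-to-nearest (ties to even), covers only the 32-bit destination, and reports overflow as a flag instead of raising a fault; the function names are illustrative.

```python
import math
from typing import Tuple

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def _saturate(n: int) -> Tuple[int, bool]:
    """Clamp to the 32-bit integer range; the flag marks an overflow."""
    if n > INT32_MAX:
        return INT32_MAX, True
    if n < INT32_MIN:
        return INT32_MIN, True
    return n, False


def cvtri(x: float) -> Tuple[int, bool]:
    """Round to nearest (ties to even), then saturate."""
    return _saturate(round(x))


def cvtzri(x: float) -> Tuple[int, bool]:
    """Round toward zero, then saturate."""
    return _saturate(math.trunc(x))
```

With these definitions, cvtri(2.7) gives 3 while cvtzri(2.7) gives 2, and an out-of-range source such as 3e9 saturates to the maximum positive value.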


dst ← integer (src);
# src is rounded to integer value




Refer to the discussion of faults at the beginning of this chapter.


The following exception can be raised. Whether or not the exception results 
in a fault being raised depends on the state of its associated mask bit in the 
arithmetic controls register. 


# Conversion of long real value to an integer

movrl   g4, fp2         # long-real source is
                        # converted to extended-real
                        # format and moved to fp2

cvtril  fp2, g12        # extended-real value is
                        # converted to long integer


cvtri      6C0    REG
cvtril     6C1    REG
cvtzri     6C2    REG
cvtzril    6C3    REG


daddc


src1,   reg
src2,   reg


Description:
Adds bits 0 through 3 of src2 and src1 and bit 1 of the condition code (used
here as a carry bit). The result is stored in bits 0 through 3 of dst. If the
addition results in a carry, bit 1 of the condition code is set. Bits 4 through
31 of src2 are copied to dst unchanged.


This instruction is intended to be used iteratively to add binary-coded-decimal
(BCD) values in which the least-significant four bits of the operands
represent the decimal numbers 0 to 9. The instruction assumes that the least
significant 4 bits of both operands are valid BCD numbers. If these bits are
not valid BCD numbers, the resulting value in dst is unpredictable.


# Let the value of the condition code be xCx.
dst ← src2 + src1 + C;
AC.cc ← 2#0C0#;
# C is carry from addition of bits 0 through 3 of operands
# Bits 4 - 31 of dst are same as bits 4 - 31 of src2
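The iterative use of this instruction can be illustrated with a small Python model. It is a sketch under stated assumptions: a decimal carry is taken to occur when the digit sum exceeds 9, the condition code is passed in and returned as a 3-bit value, and bcd_add is a hypothetical helper that walks two equal-length digit lists, least-significant digit first.

```python
def daddc(src1: int, src2: int, cc: int):
    """Add the low BCD digits of src1 and src2 plus the carry in cc bit 1."""
    carry_in = (cc >> 1) & 1
    total = (src1 & 0xF) + (src2 & 0xF) + carry_in
    carry_out = 1 if total > 9 else 0
    digit = total - 10 if carry_out else total
    dst = (src2 & 0xFFFFFFF0) | digit   # bits 4-31 come from src2
    return dst, carry_out << 1          # new condition code is 2#0C0#


def bcd_add(a_digits, b_digits):
    """Add two equal-length digit lists, least-significant digit first."""
    cc = 0
    out = []
    for a, b in zip(a_digits, b_digits):
        dst, cc = daddc(a, b, cc)
        out.append(dst & 0xF)
    return out, (cc >> 1) & 1
```

For instance, adding the digit lists for 99 and 01 produces the digits 0, 0 with a final carry of 1, that is, 100.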


daddc   g5, g9, g10     # g10 ← g9 + g5 + Carry Bit,
                        # where arithmetic is
                        # carried out only on bits 0
                        # through 3 of the operands


divi, divo


divi    Divide Integer
divo    Divide Ordinal


src1,   reg/lit
src2,   reg/lit


Refer to the discussion of faults at the beginning of this chapter.


The src1 operand is 0.


The following fault condition can be raised with the divi instruction.
Whether or not a fault is raised depends on the state of its associated mask bit
in the arithmetic-controls register.


divi    74B    REG
divo    70B    REG


divr, divrl


divr     Divide Real
divrl    Divide Long Real


src1,   freg/flit
src2,   freg/flit
dst     freg


For the divrl instruction, if the src1, src2, or dst operand references a global
or local register, this register is the first (lowest numbered) of two successive
registers. Also, this register must be even numbered (e.g., g0, g2, g4).


The sign of the result is always the exclusive-OR of the source signs, even if
one or more of the source values is 0, ∞, or a NaN.


The following table shows the results obtained when dividing various classes
of numbers, assuming that neither overflow nor underflow occurs.


                               Src1
Src2     -∞     -F     -0     +0     +F     +∞     NaN

-∞       *      +∞     +∞     -∞     -∞     *      NaN
-F       +0     +F     **     **     -F     -0     NaN
-0       +0     +0     *      *      -0     -0     NaN
+0       -0     -0     *      *      +0     +0     NaN
+F       -0     -F     **     **     +F     +0     NaN
+∞       *      -∞     -∞     +∞     +∞     *      NaN
NaN      NaN    NaN    NaN    NaN    NaN    NaN    NaN


Notes:

F     Means finite-real number.
*     Indicates floating invalid-operation exception.
**    Indicates floating zero-divide exception.
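The special-case table for dst = src2 / src1 can be cross-checked with a short Python model. This is an illustrative sketch: the function name is hypothetical, exceptions are reported as strings in place of a result, and overflow and underflow are ignored, as in the table.

```python
import math


def divr_special(src1: float, src2: float):
    """Model dst = src2 / src1; returns (result, exception name or None)."""
    if math.isnan(src1) or math.isnan(src2):
        return math.nan, None
    # The result sign is the exclusive-OR of the source signs.
    negative = math.copysign(1.0, src1) != math.copysign(1.0, src2)
    both_zero = src1 == 0.0 and src2 == 0.0
    both_inf = math.isinf(src1) and math.isinf(src2)
    if both_zero or both_inf:
        return None, "invalid-operation"
    if src1 == 0.0:
        if math.isinf(src2):
            return (-math.inf if negative else math.inf), None
        return None, "zero-divide"
    if math.isinf(src1):
        return (-0.0 if negative else 0.0), None
    return src2 / src1, None
```

For example, divr_special(-0.0, -math.inf) returns positive infinity with no exception, matching the first row of the table.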




Refer to the discussion of faults at the beginning of this chapter.


One or more operands is an unnormalized (including denormalized) value and
the normalizing-mode bit in the arithmetic controls is set.


The following floating-point exceptions can be raised. Whether or not an
exception results in a fault being raised depends on the state of its associated
mask bit in the arithmetic controls.


Floating Overflow             Result is too large for destination format.

Floating Underflow            Result is too small for destination format.

Floating Zero Divide          The src1 operand is 0 and the src2 operand is
                              numeric and finite.

Floating Invalid Operation    Both source operands are 0 or both are ∞.

                              One or more operands are an SNaN value.

Floating Inexact              Result cannot be represented exactly in
                              destination format.


divr     78B    REG
divrl    79B    REG


src, 
reg 


Description:
Copies the src value into dst. The least-significant eight bits of the src value
are tested to determine whether or not they constitute a valid ASCII decimal
digit (2#00110000# .. 2#00111001#), and the condition code is set
accordingly. If the value is a valid ASCII decimal digit, the condition code is
set to 2#000#; otherwise, it is set to 2#010#.


dst ← src;
if src = 2#00110000# .. 2#00111001#
then AC.cc ← 2#000#;
else AC.cc ← 2#010#;
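The ASCII-digit test amounts to a range check on the low byte. A minimal Python sketch follows; the function name reuses the mnemonic for illustration and returns the copied value together with the resulting 3-bit condition code.

```python
def dmovt(src: int):
    """Copy src and test its low byte for an ASCII decimal digit ('0'..'9')."""
    low_byte = src & 0xFF
    cc = 0b000 if 0x30 <= low_byte <= 0x39 else 0b010
    return src & 0xFFFFFFFF, cc
```

Here dmovt(ord("7")) yields condition code 0b000, while dmovt(ord("A")) yields 0b010.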


dmovt   g1, g6          # g6 ← g1;
                        # g1 tested for decimal value


dsubc


src1,   reg
src2,   reg


Description:
Subtracts bits 0 through 3 of src1 from bits 0 through 3 of src2, together with
bit 1 of the condition code (used here as a carry bit). The result is stored in
bits 0 through 3 of dst. If the subtraction results in a carry, bit 1 of the
condition code is set. Bits 4 through 31 of src2 are copied to dst unchanged.


This instruction is intended to be used iteratively to subtract
binary-coded-decimal (BCD) values in which the least-significant four bits of
the operands represent the decimal numbers 0 to 9. The instruction assumes
that the least significant 4 bits of both operands are valid BCD numbers. If
these bits are not valid BCD numbers, the resulting value in dst is
unpredictable.


# Let the value of the condition code be xCx.
dst ← src2 - src1 - 1 + C;
AC.cc ← 2#0C0#;
# C is carry from subtraction of bits 0 through 3 of operands
# Bits 4 - 31 of dst are same as bits 4 - 31 of src2


dsubc   r1, r2, r12     # r12 ← r2 - r1 - 1 + Carry
                        # Bit, where arithmetic is
                        # carried out only on bits 0
                        # through 3 of the operands


src1,   reg/lit
src2,   reg/lit


Description:
Divides src2 by src1 and stores the result in dst. The src2 value is a long
ordinal (i.e., 64 bits), which is contained in two adjacent registers. The src2
operand specifies the lower numbered register, which contains the least
significant bits of the operand. The src2 operand must be an even numbered
register (i.e., r0, r2, r4, ... or g0, g2, ...). The src1 value is a normal ordinal
(i.e., 32 bits).


The remainder is stored in the register designated by dst and the quotient is
stored in the next highest numbered register. The dst operand must be an
even numbered register (i.e., r0, r2, r4, ... or g0, g2, ...).


If this operation overflows (i.e., the quotient or remainder does not fit in 32
bits), no fault is raised and the result is undefined.


dst ← (src2 - (src2 / src1) * src1);    # remainder
dst + 1 ← (src2 / src1);                # quotient
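A Python sketch of the extended divide makes the register pairing explicit. The function and parameter names are illustrative; src1 is assumed to be nonzero, and the undefined overflow case is modeled by returning None.

```python
MASK32 = 0xFFFFFFFF


def ediv(src1: int, src2_low: int, src2_high: int):
    """Divide the 64-bit ordinal src2_high:src2_low by the 32-bit ordinal src1.

    Returns (remainder, quotient), i.e. the values for dst and dst + 1,
    or None when the quotient does not fit in 32 bits.
    """
    dividend = ((src2_high & MASK32) << 32) | (src2_low & MASK32)
    quotient, remainder = divmod(dividend, src1 & MASK32)
    if quotient > MASK32:
        return None
    return remainder, quotient
```

For example, ediv(10, 7, 1) divides 2^32 + 7 by 10 and returns a remainder of 3 and a quotient of 429496730.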


ediv    g3, g4, g10     # g10 ← remainder of g4,g5/g3
                        # g11 ← quotient of g4,g5/g3


src1,   reg/lit
src2,   reg/lit


Description:
Multiplies src2 by src1 and stores the result in dst. The result is a long
ordinal (i.e., 64 bits), which is stored in two adjacent registers. The dst
operand specifies the lower numbered register, which receives the least
significant bits of the result. The dst operand must be an even numbered
register (i.e., r0, r2, r4, ... or g0, g2, ...).


dst ← (src1 * src2) mod 2^32;
dst + 1 ← (src1 * src2) / 2^32;
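The split of the 64-bit product across the register pair can be sketched in Python as follows; the function name is illustrative and the operands are treated as 32-bit ordinals.

```python
MASK32 = 0xFFFFFFFF


def emul(src1: int, src2: int):
    """Return (low word, high word) of the 64-bit product, for dst and dst + 1."""
    product = (src1 & MASK32) * (src2 & MASK32)
    return product & MASK32, product >> 32
```

Multiplying 16#FFFFFFFF# by itself, for instance, gives a low word of 1 and a high word of 16#FFFFFFFE#.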


expr, exprl


expr     Exponent Real
exprl    Exponent Long Real


src,    freg/flit
dst     freg


Description:
Calculates an approximation of the exponential value of 2 to the src power,
minus 1, and stores the result in dst. The src value must be within the range
of -0.5 to +0.5, inclusive. If the src value is outside this range, the result is
undefined.


For the exprl instruction, if the src or dst operand references a global or local
register, this register is the first (lowest numbered) of two successive
registers. Also, this register must be even numbered (e.g., g0, g2, g4).


The following table shows the results obtained when computing the exponent
of various classes of numbers.


Src              Dst

-0.5 to -0       (1/√2) - 1 to -0
-0               -0
+0               +0
+0 to +0.5       +0 to √2 - 1


Notes:

***    Results are unpredictable



Refer to the discussion of faults at the beginning of this chapter.


One or more operands is an unnormalized (including denormalized) value and
the normalizing-mode bit in the arithmetic controls is set.


The following floating-point exceptions can be raised. Whether or not an
exception results in a fault being raised depends on the state of its associated
mask bit in the arithmetic controls.


Floating Underflow            Result is too small for destination format.

Floating Invalid Operation    One or more operands are an SNaN value.

Floating Inexact              Result cannot be represented exactly in
                              destination format.


Example:
# y = 2^x (y and x in g0)
# uses identity
#     2^x = 2^(I+f)
#         = 2^I * ((2^f - 1) + 1)
# where: I integer, -0.5 <= f <= +0.5
# assumes round-to-nearest
# does not handle infinities or NaNs

_pow2x:
        roundr  g0,fp0          # I in fp0
        subr    fp0,g0,g0       # f in g0
        expr    g0,g0           # 2^f - 1 in g0
        addr    0f1.0,g0,g0     # 2^f in g0
        cvtri   fp0,g1          # I as an integer in g1
        scaler  g1,g0,g0        # y = 2^f * 2^I in g0
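The identity used by the example can be checked numerically with a Python sketch. It is not the processor's algorithm: expr is modeled with math.expm1, the final scaling with math.ldexp, and Python's round (ties to even) stands in for the round-to-nearest mode the example assumes.

```python
import math


def expr(src: float) -> float:
    """Model of 2**src - 1, defined here only for -0.5 <= src <= +0.5."""
    if not -0.5 <= src <= 0.5:
        raise ValueError("src outside -0.5 .. +0.5: result undefined")
    return math.expm1(src * math.log(2.0))


def pow2x(x: float) -> float:
    """2**x via the identity 2**x = 2**I * ((2**f - 1) + 1)."""
    i = round(x)                          # integer part I (round to nearest)
    f = x - i                             # fraction f, within -0.5 .. +0.5
    return math.ldexp(expr(f) + 1.0, i)   # scale 2**f by 2**I
```

Splitting x this way keeps the argument of expr inside its permitted range for any finite x whose integer part fits the exponent range.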


Opcode:
expr     689    REG
exprl    699    REG


See Also:
scaler, logr


bitpos, 
reg/lit 
len, 
reg/lit 
src/dst 
reg 


Description: 
Shifts a specified bit field in src/dst right and fills the bits to the left of the 
shifted bit field with zeros. 
The bitpos value specifies the least significant bit 
of the bit field to be shifted, and the len value specifies the length of the bit 
field. 


Action:
src/dst ← (src/dst / 2^(bitpos mod 32)) and (2^len - 1);
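The action is a shift followed by a mask, which the following Python sketch mirrors directly (the function name reuses the mnemonic for illustration and the register is treated as a 32-bit value).

```python
MASK32 = 0xFFFFFFFF


def extract(bitpos: int, length: int, src_dst: int) -> int:
    """Shift the bit field down to bit 0 and clear every bit above it."""
    shifted = (src_dst & MASK32) >> (bitpos % 32)
    return shifted & ((1 << length) - 1)
```

For example, extract(4, 8, 0x12345678) isolates bits 4 through 11 and returns 0x67.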


extract 5, 12, g4       # g4 ← g4 with bits 5
                        # through 16 shifted right


faulte     Fault If Equal
faultne    Fault If Not Equal
faultl     Fault If Less
faultle    Fault If Less Or Equal
faultg     Fault If Greater
faultge    Fault If Greater Or Equal
faulto     Fault If Ordered
faultno    Fault If Unordered


Description:
Raises a constraint-range fault if the logical AND of the condition code and
the mask-part of the opcode is not zero.


Instruction    Mask    Condition

faultno        000     Unordered
faultg         001     Greater
faulte         010     Equal
faultge        011     Greater or equal
faultl         100     Less
faultne        101     Not equal
faultle        110     Less or equal
faulto         111     Ordered


For the faultno instruction (unordered), the fault is raised if the condition
code is equal to 2#000#.


For all instructions except faultno:
if (mask and AC.cc) ≠ 2#000#
then raise constraint-range fault;


For faultno:
if AC.cc = 2#000#
then raise constraint-range fault;
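The mask test, including the special case for faultno, can be expressed as a small Python lookup. This is a sketch: the dictionary and function names are illustrative, and the function only reports whether a constraint-range fault would be raised.

```python
FAULT_MASKS = {
    "faultno": 0b000, "faultg": 0b001, "faulte": 0b010, "faultge": 0b011,
    "faultl": 0b100, "faultne": 0b101, "faultle": 0b110, "faulto": 0b111,
}


def fault_raised(mnemonic: str, cc: int) -> bool:
    """True when the instruction would raise a constraint-range fault."""
    mask = FAULT_MASKS[mnemonic]
    cc &= 0b111
    if mask == 0b000:            # faultno: fault only when cc is 2#000#
        return cc == 0b000
    return (mask & cc) != 0
```

With a condition code of 2#010# (equal), faultle and faulte fault while faultl does not.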


# assume (2#110# AND AC.cc) ≠ 2#000#
faultle                 # raises Constraint Range Fault


faulte     1A    CTRL
faultne    1D    CTRL
faultl     1C    CTRL
faultle    1E    CTRL
faultg     19    CTRL
faultge    1B    CTRL
faulto     1F    CTRL
faultno    18    CTRL


be, teste 


flushreg


Description:
Copies the contents of all the cached local-register sets into their associated
register-save areas in the procedure stack. The contents of all the
local-register sets except for the current set are then marked as invalid. On a
return, the local registers for the frame being returned to are then loaded from
the stack.


The flushreg instruction is provided to allow a compiler or applications
program to circumvent the normal call/return mechanism of the processor.
For example, a compiler may need to back up several frames in the stack on
the next return, rather than using the normal return mechanism that returns
one frame at a time. Here, the compiler uses the flushreg instruction to
update the stack with the current states of the saved register sets. The
compiler can then return to any frame in the stack without losing the contents
of the saved local-register sets. To return to a frame other than the frame
directly below the current frame, the compiler merely modifies the PFP in
register r0 of the current frame to point to the frame that it wishes to return
to.


Each register set except the current set is flushed to its associated stack frame
in memory and marked as purged, meaning that it will be reloaded from
memory if and when it becomes the current local register set.


Description:
Generates a breakpoint trace-event, regardless of the setting of the breakpoint
trace mode flag.


When a breakpoint trace event is detected, the trace-fault-pending flag (bit
10) of the process controls word and the breakpoint-trace-event flag (bit 23)
of the trace controls are set. Before the next instruction is executed, a trace
fault is generated.


if process.trace_controls and breakpoint_trace_flag
then raise trace breakpoint fault


ld      xyz, r4
addi    r4, r5, r6
fmark                   # Breakpoint trace event is generated at
                        # this point in the instruction stream.


ld      Load
ldob    Load Ordinal Byte
ldos    Load Ordinal Short
ldib    Load Integer Byte
ldis    Load Integer Short
ldl     Load Long
ldt     Load Triple
ldq     Load Quad


src,    mem


Description:
Copies a byte or string of bytes from memory into a register or group of
successive registers. The src operand specifies the address of the first byte to
be loaded. The full range of addressing modes may be used in specifying src.
(Refer to Chapter 5 for a complete discussion of the addressing modes
available with memory-type operands.)


The dst operand specifies a register or the first (lowest numbered) register of
successive registers.


The ldob and ldib, and ldos and ldis instructions load a byte and half word,
respectively, and convert it to a full 32-bit word. The ld, ldl, ldt, and ldq
instructions copy 4, 8, 12, and 16 bytes, respectively, from memory into
successive registers.


For the ldl instruction, dst must specify an even numbered register (e.g., g0,
g2, ..., g12). For the ldt and ldq instructions, dst must specify a register
number that is a multiple of four (e.g., g0, g4, g8). If the data extends
beyond register g15 or r15 for the ldl, ldt, or ldq instruction, the results are
unpredictable.
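How a byte or half word becomes a full 32-bit word depends on the data type. The Python sketch below assumes that the ordinal loads (ldob, ldos) zero-extend and the integer loads (ldib, ldis) sign-extend; that assumption, and the helper names, are illustrative and not taken from the text above.

```python
MASK32 = 0xFFFFFFFF


def zero_extend(value: int, bits: int) -> int:
    """Keep the low `bits` bits; the upper bits of the word become 0."""
    return value & ((1 << bits) - 1)


def sign_extend(value: int, bits: int) -> int:
    """Replicate the top bit of the field through bit 31 of the word."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK32


def ldob(byte: int) -> int:
    return zero_extend(byte, 8)


def ldib(byte: int) -> int:
    return sign_extend(byte, 8)


def ldos(half: int) -> int:
    return zero_extend(half, 16)


def ldis(half: int) -> int:
    return sign_extend(half, 16)
```

Under this assumption the byte 16#FF# loads as 16#000000FF# with ldob and as 16#FFFFFFFF# with ldib.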


ldl     2456 (r3), r10  # r10, r11 ← value of two
                        # words beginning at offset
                        # 2456 plus the address in
                        # r3 in memory


LOAD 


Opcode:
ld      90    MEM
ldob    80    MEM
ldos    88    MEM
ldib    C0    MEM
ldis    C8    MEM
ldl     98    MEM
ldt     A0    MEM
ldq     B0    MEM


See Also:
MOVE, STORE




src 
mem 
efa 


Description:
Computes the effective address specified with src and stores it in dst. The
src address is not checked for validity.


An important application of this instruction is to load a constant longer than
5 bits into a register. (To load a register with a constant of 5 bits or less, the
move instruction (mov) can be used with a literal as the src operand.)


lda     58 (g9), g1     # Computes the effective
                        # address specified with
                        # 58 (g9) and stores it in g1

                        # loads the constant 16#749#
                        # in r8

logbnr, logbnrl


logbnr     Log Binary Real
logbnrl    Log Binary Long Real


src,    freg/flit
dst     freg


Description:
Calculates log2 (src) and stores the integral part of this value (i.e., the
part to the left of the binary point) as a real number in dst. The result of this
operation is an unbiased exponent. When src is a denormalized number, dst
is the unbiased exponent that src would have if the format had unlimited
exponent range.


(The fractional part of log2 (src) is ignored. If the fractional part is needed,
use the logr or logrl instruction.)


This instruction implements the IEEE recommended function logb. It is
useful for calculating the order of magnitude of a number.


For the logbnrl instruction, if the src or dst operand references a global or
local register, this register is the first (lowest numbered) of two successive
registers. Also, this register must be even numbered (e.g., g0, g2, g4).
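The unbiased-exponent result can be modeled with math.frexp, which also treats denormalized inputs as if the exponent range were unlimited. This is a sketch with an illustrative function name; the zero-divide case is reported as a string instead of raising a fault.

```python
import math


def logbnr(src: float):
    """Unbiased binary exponent of src, returned as a real number."""
    if math.isnan(src):
        return math.nan
    if math.isinf(src):
        return math.inf
    if src == 0.0:
        return "zero-divide"
    # abs(src) = m * 2**e with 0.5 <= m < 1, so the exponent sought is e - 1.
    _, e = math.frexp(abs(src))
    return float(e - 1)
```

For example, logbnr(10.0) is 3.0 because 10 lies between 2^3 and 2^4.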


The following table shows the results obtained when taking the log binary of
various classes of numbers, assuming that neither overflow nor underflow
occurs.


Src      Dst

-∞       +∞
-F       ±F
-0       **
+0       **
+F       ±F
+∞       +∞
NaN      NaN


Notes:

F     Means finite-real number
**    Indicates floating zero-divide exception




Note that the significand of the src operand can be extracted by using the
scaler or scalerl instruction.


dst ← (log2 (unbiased exponent (src)) - fraction);
# the integral part of the unbiased exponent of src
# is stored in dst as a biased real


Refer to the discussion of faults at the beginning of this chapter.


One or more operands is an unnormalized (including denormalized) value and
the normalizing-mode bit in the arithmetic controls is set.


The following floating-point exceptions can be raised. Whether or not an
exception results in a fault being raised depends on the state of its associated
mask bit in the arithmetic controls.


Floating Underflow            Result is too small for destination format.

Floating Invalid Operation    One or more operands are an SNaN value.

Floating Inexact              Result cannot be represented exactly in
                              destination format.

Floating Zero Divide          The src operand is 0.


# fp3 ← integral part
# of log2 (g12,g13)


logbnr     68A    REG
logbnrl    69A    REG


logepr, logeprl


logepr     Log Epsilon Real
logeprl    Log Epsilon Long Real


src1,   freg/flit
src2,   freg/flit
dst     freg


For the logeprl instruction, if the src1, src2, or dst operand references a
global or local register, this register is the first (lowest numbered) of two
successive registers. Also, this register must be even numbered (e.g., g0, g2,
g4).


The following table shows the results obtained when taking the log epsilon of
various classes of numbers, assuming that neither overflow nor underflow
occurs.


                                      Src1
Src2     (1/√2) - 1 to -0     -0     +0     +0 to √2 - 1     NaN

-∞       -∞                   *      *      -∞               NaN
-F       +F                   +0     -0     -F               NaN
-0       +0                   +0     -0     -0               NaN
+0       -0                   -0     +0     +0               NaN
+F       -F                   -0     +0     +F               NaN
+∞       +∞                   *      *      +∞               NaN
NaN      NaN                  NaN    NaN    NaN              NaN


Notes:

F     Means finite-real number.
*     Indicates floating invalid-operation exception.

This instruction offers optimal accuracy for values of src1 + 1 close to 1 (i.e.,
for values of src1 close to 0). This expression is commonly found in
compound interest and annuity calculations. The result can be simply converted
into a value in another logarithm base by including a scale factor in src2.




The following 
equation 
is used to calculate 
the scale factor for a particular 
logarithm base, where n is the logarithm base desired for the result stored in 
dst: 


When the src1 operand is outside this range, the logr or logrl instruction can
be used with very insignificant loss of accuracy by adding 1.0 to src1.
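The role of the scale factor, and the accuracy benefit for src1 close to 0, can be illustrated in Python. The sketch assumes the standard change-of-base identity, under which the scale factor for base n is 1 / log2 (n); the function names are illustrative and math.log1p stands in for the instruction's internal computation.

```python
import math


def scale_factor(n: float) -> float:
    """Assumed scale factor for logarithm base n: 1 / log2(n)."""
    return 1.0 / math.log2(n)


def logepr(src1: float, src2: float) -> float:
    """Model of src2 * log2(src1 + 1), accurate for src1 close to 0."""
    return src2 * math.log1p(src1) / math.log(2.0)
```

For a very small src1 such as 1e-20, forming src1 + 1 first rounds to exactly 1.0 and the logarithm collapses to 0, whereas this form keeps the information.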


Refer to the discussion of faults at the beginning of this chapter.


One or more operands is an unnormalized (including denormalized) value and
the normalizing-mode bit in the arithmetic controls is set.


The following floating-point exceptions can be raised. Whether or not an
exception results in a fault being raised depends on the state of its associated
mask bit in the arithmetic controls.


Floating Overflow             Result is too large for destination format.

Floating Underflow            Result is too small for destination format.

Floating Invalid Operation    The src1 operand is 0 and the src2 operand
                              is ∞.

                              The src1 operand does not fall within the
                              range defined in the above description
                              section.

                              One or more operands are an SNaN value.

Floating Inexact              Result cannot be represented exactly in
                              destination format.




Example:
logepr  g8, g4, fp2     # fp2 ← g4,g5 * log2 (g8,g9 + 1)


Opcode:
logepr     681    REG
logeprl    691    REG


See Also:
logr

logr, logrl


logr     Log Real
logrl    Log Long Real


src1,   freg/flit
src2,   freg/flit
dst     freg


Description:
Calculates (src2 * log2 (src1)), and stores the result in dst. (The logbnr and
logbnrl instructions perform this function more efficiently, if only an
estimate is needed.)


For the logrl instruction, if the src1, src2, or dst operand references a global
or local register, this register is the first (lowest numbered) of two successive
registers. Also, this register must be even numbered (e.g., g0, g2, g4).


The following table shows the results obtained when taking the log of
various classes of numbers, assuming that neither overflow nor underflow
occurs.


                               Src1
Src2     -∞     -F     -0     +0     +F     +∞     NaN

-∞       *      *      **     **     ±∞     -∞     NaN
-F       *      *      **     **     ±F     -∞     NaN
-0       *      *      *      *      ±0     *      NaN
+0       *      *      *      *      ±0     *      NaN
+F       *      *      **     **     ±F     +∞     NaN
+∞       *      *      **     **     ±∞     +∞     NaN
NaN      NaN    NaN    NaN    NaN    NaN    NaN    NaN


Notes:

F     Means finite-real number.
*     Indicates floating invalid-operation exception.
**    Indicates floating zero-divide exception.

The logr instruction combined with the expr instruction forms the basis for
the power function x^y.
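One way the two instructions could combine into a power function is through the identity x^y = 2^(y * log2 (x)), splitting the exponent into an integer and a fraction that fits the expr range. The Python sketch below is an illustration under that assumption, for x greater than 0 only; the function names are illustrative.

```python
import math


def logr(src1: float, src2: float) -> float:
    """Model of src2 * log2(src1)."""
    return src2 * math.log2(src1)


def expr(src: float) -> float:
    """Model of 2**src - 1 for -0.5 <= src <= +0.5."""
    return math.expm1(src * math.log(2.0))


def power(x: float, y: float) -> float:
    """x**y computed as 2**(y * log2(x)), for x > 0."""
    t = logr(x, y)
    i = round(t)                          # integer part of the exponent
    f = t - i                             # fraction within -0.5 .. +0.5
    return math.ldexp(expr(f) + 1.0, i)   # 2**f scaled by 2**i
```

For example, power(2.0, 10.0) evaluates to 1024.0.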




Adding 1.0 to a number to be used as the src1 operand will cause information
to be lost. To perform this function, use the logepr or logeprl instruction.


These instructions provide a simple method of converting the result of the
log2 arithmetic into a value in another logarithm base by including a scale
factor in src2. The following equation is used to calculate the scale factor for
a particular logarithm base, where n is the logarithm base desired for the
result stored in dst:


Refer to the discussion of faults at the beginning of this chapter.


One or more operands is an unnormalized (including denormalized) value and
the normalizing-mode bit in the arithmetic controls is set.


The following floating-point exceptions can be raised. Whether or not an
exception results in a fault being raised depends on the state of its associated
mask bit in the arithmetic controls.


Floating Overflow 


Floating Underflow 


Floating Zero Divide 


Result is too large for destination format.


Result is too small for destination format.


The src1 operand is 0 and src2 is nonzero.


The src1 and src2 operands are both 0.


The src1 operand is ∞ and the src2 operand is 0.


The 
srcl 
operand 
is 
and 
the 
src2 


operand is 00. 


The src1 operand is negative and nonzero.


One or more operands are an SNaN value.


Result cannot be represented exactly in destination format.


logr     682    REG
logrl    692    REG

Description:
Generates a breakpoint trace event if the breakpoint trace mode has been
enabled. The breakpoint trace mode is enabled if the trace-enable bit (bit 0)
of the process controls and the breakpoint-trace mode bit (bit 7) of the trace
controls have been set. Both these words are located in the PCB.


When a breakpoint trace event is detected, the trace-fault-pending flag (bit
10) of the process controls and the breakpoint-trace-event flag (bit 23) of the
trace controls are set. Before the next instruction is executed, a trace fault is
generated.


If the breakpoint-trace mode has not been enabled, the mark instruction
behaves like a no-op.


# Assume that the breakpoint trace mode is
# enabled.
ld      xyz, r4
addi    r4, r5, r6
mark                    # Breakpoint trace event is generated at
                        # this point in the instruction stream.


mask,   reg/lit
src,    reg/lit


Description:
Reads and modifies the arithmetic controls. The src operand contains the
value to be placed in the arithmetic controls and the mask operand specifies
the bits that may be changed. Only the bits set in mask are modified in the
arithmetic controls. Once the arithmetic controls have been changed, their
initial state is copied into dst.


Action:
temp ← AC;
AC ← (src and mask) or
     (AC and not (mask));
dst ← temp;
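The masked read-modify-write pattern used here can be sketched in Python. The function name is illustrative; it returns the new controls value together with the initial value that the description says is copied into dst.

```python
MASK32 = 0xFFFFFFFF


def masked_update(controls: int, mask: int, src: int):
    """Return (new controls, initial controls) for a masked write."""
    new_value = (src & mask) | (controls & ~mask & MASK32)
    return new_value, controls
```

For example, with controls 16#F0F0#, mask 16#00FF#, and src 16#0FFF#, only the low byte changes: the new value is 16#F0FF# and the returned initial value is 16#F0F0#.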


# AC ← g9, masked by g1
# g12 ← initial value of AC


src1,   reg/lit
src2,   reg/lit


Description:
Divides src2 by src1, where both are integers, and stores the modulo
remainder of the result in dst. If the result is nonzero, dst is given the same
sign as src1.


Action:
dst ← src2 - ((src2/src1) * src1);
if src2 * src1 < 0
then dst ← dst + src1;
end if;
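The modulo result can be modeled in Python as a truncating division followed by a sign adjustment. This sketch follows the prose description, so the adjustment is applied only when the remainder is nonzero and its sign differs from that of src1 (an interpretation on my part); src1 is assumed to be nonzero and the function name is illustrative.

```python
def modi(src1: int, src2: int) -> int:
    """Modulo remainder of src2 / src1, with the sign of src1 when nonzero."""
    quotient = abs(src2) // abs(src1)          # truncating integer division
    if (src2 < 0) != (src1 < 0):
        quotient = -quotient
    dst = src2 - quotient * src1               # remainder, sign of src2
    if dst != 0 and (dst < 0) != (src1 < 0):
        dst += src1                            # move to the sign of src1
    return dst
```

Under this interpretation the result agrees with Python's own % operator, for example modi(3, -7) is 2.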


modify 


mask, 
reg/lit 
src, 
reg/lit 
src/dst 
reg 


Description: 
Modifies 
selected 
bits in src/dst 
with bits from src. 
The mask operand 
selects the bits to be modified: 
only the bits set in the mask are modified in 
src/dst. 


modpc


src,       reg/lit
mask,      reg/lit
src/dst    reg


Description:
Reads and modifies the processor's internally cached process controls as
specified with mask and src/dst. The src/dst operand contains the value to be
placed in the process controls and the mask operand specifies the bits that
may be changed. Only the bits set in the mask are modified in the process
controls. Once the process controls have been changed, their initial value is
copied into src/dst. The src operand is a dummy operand that should be set
equal to the mask operand.


The processor must be in the supervisor mode to modify the process controls
using this instruction. If the mask operand is set to 0, this instruction can be
used to read the process controls, without the processor being in the
supervisor mode.


If the action of this instruction results in the priority of the processor being
lowered, the interrupt table is checked for pending interrupts.


Changing the state, resume, internal state, and trace enable fields of the
process controls can lead to unpredictable behavior, as described in Chapter
7 in the section titled "Changing the Process-Controls Word."


if mask ≠ 0
then if process.process_controls.execution_mode ≠ supervisor
     then raise type-mismatch fault;
     end if;
     temp ← process.process_controls;
     process.process_controls ←
         (mask and src/dst) or
         (process.process_controls and not (mask));
     src/dst ← temp;
     if (temp.priority > process.process_controls.priority)
     then check_pending_interrupts;
     # if continue here, no interrupt to do
     end if;
else src/dst ← process.process_controls;
end if;


# process controls ← g8
# masked by g9


mask,   reg/lit
src,    reg/lit


Description:
Reads and modifies the trace controls for the current process. The processor
changes its internally cached trace controls as specified with mask and src.
The src operand contains the value to be placed in the trace controls and the
mask operand specifies the bits that may be changed. Only the bits set in the
mask are modified in the trace controls. Once the trace controls have been
changed, their initial state is copied into dst.


This instruction only affects the trace controls cached in the processor. The
trace controls in the PCB for the current process are not affected.


Since bits 8 through 15 and 24 through 31 of the trace-controls word are
reserved, the mask operand is ANDed with 16#00FF00FF# to ensure that these
bits are not set in the mask.


The changed trace controls take effect on the first non-branching instruction
fetched from memory. Since instructions are prefetched four at a time, the
trace controls may not take effect for up to the next four instructions
executed.


temp ← process.trace_controls;
temp1 ← 16#00FF00FF# and mask;
process.trace_controls ←
    (temp1 and src) or
    (process.trace_controls and not(temp1));
dst ← temp;
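The effect of restricting the mask to the non-reserved bits can be seen in a short Python sketch of this action. The function name reuses the mnemonic for illustration, and the trace controls are passed in and returned as plain integers.

```python
MASK32 = 0xFFFFFFFF
WRITABLE_BITS = 0x00FF00FF      # bits 8-15 and 24-31 are reserved


def modtc(trace_controls: int, mask: int, src: int):
    """Return (new trace controls, dst), where dst is the previous value."""
    effective_mask = WRITABLE_BITS & mask
    new_value = (effective_mask & src) | (trace_controls & ~effective_mask & MASK32)
    return new_value, trace_controls
```

Even with a mask and source of all ones, the reserved bits stay clear: modtc(0, 0xFFFFFFFF, 0xFFFFFFFF) gives a new value of 16#00FF00FF#.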


modtc   g12, g10, g2    # trace controls ← g10 masked by g12;
                        # previous trace controls stored in g2


mov     Move
movl    Move Long
movt    Move Triple
movq    Move Quad


src,    reg/lit


Description:
Copies the content of one or more source registers (specified with the src
operand) to one or more destination registers (specified with the dst
operand).


For the movl, movt, and movq instructions, the src and dst operands specify
the first (lowest numbered) register of several successive registers. The src
and dst registers must be even numbered (e.g., g0, g2) for the movl
instruction and an integral multiple of four (e.g., g0, g4) for the movt and
movq instructions.


mov     5CC    REG
movl    5DC    REG
movt    5EC    REG
movq    5FC    REG


movr, movre, movrl


movr     Move Real
movrl    Move Long Real
movre    Move Extended Real


src,    freg/flit
dst     freg


Description: Copies a real value from one or more source registers (specified with the src operand) to one or more destination registers (specified with the dst operand).


For the movrl instruction, if the src or dst operand references a global or local register, this register is the first (lowest numbered) of two successive registers. For the movre instruction, if the src or dst operand references a global or local register, this register is the first (lowest numbered) of three successive registers.


When copying real numbers between global or local registers and floating-point registers, conversion between the real or long-real format and the extended-real format is performed implicitly. Conversion between the real and long-real formats must be done through floating-point registers and requires two instructions, as illustrated in the example below.


When the movre instruction moves an operand from global or local registers to a floating-point register, it automatically truncates the most-significant 16 bits of the word in the third register (refer to Figure 12-5). Likewise, when this instruction is used to move an operand from a floating-point register to global or local registers, it adds 16 zeros to the third word. The movre instruction is not a numeric instruction; it merely manipulates bits.


The movr and movrl instructions can cause a floating-point exception to be raised, which might result in a fault being raised, as is explained in the section below on faults. The movre instruction can never raise an exception and thus never faults.




Refer to the discussion of faults at the beginning of this chapter.

One or more operands is an unnormalized (including denormalized) value and the normalizing-mode bit in the arithmetic controls is set.

The following floating-point exceptions can be raised. Whether or not an exception results in a fault being raised depends on the state of its associated mask bit in the arithmetic controls.

Floating Overflow             Result is too large for destination format.
Floating Underflow            Result is too small for destination format.
Floating Invalid Operation    Source operand is an SNaN value.
Floating Inexact              Result cannot be represented exactly in destination format.


# Conversion of real value in g3 to a long real value,
# which is stored in g4,g5
movr   g3, fp2
movrl  fp2, g4


movr     6C9    REG
movrl    6D9    REG
movre    6E9    REG


muli, mulo

muli     Multiply Integer
mulo     Multiply Ordinal

src1,      src2,
reg/lit    reg/lit

muli     741    REG
mulo     701    REG


mulr, mulrl

mulr     Multiply Real
mulrl    Multiply Long Real

src1,        src2,        dst
freg/flit    freg/flit    freg


For the mulrl instruction, if the src1, src2, or dst operand references a global or local register, this register is the first (lowest numbered) of two successive registers. Also, this register must be even numbered (e.g., g0, g2, g4).


The sign of the result is always the exclusive-OR of the source signs, even if one or more of the source values is 0, ∞, or a NaN.
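The sign rule can be checked with ordinary IEEE double arithmetic. The sketch below uses Python floats as a stand-in for the processor's real formats, which is an assumption. The helper names are illustrative, and NaN results are left out because Python does not expose a portable NaN sign.

```python
import math


def sign_bit(x):
    """True when the sign bit of x is set (works for -0.0 and -inf too)."""
    return math.copysign(1.0, x) < 0


def expected_product_sign(a, b):
    """Exclusive-OR of the two source signs."""
    return sign_bit(a) != sign_bit(b)


pairs = [(-0.0, 5.0), (0.0, -0.0), (-math.inf, -2.0), (math.inf, -0.5), (3.0, 4.0)]
checks = [sign_bit(a * b) == expected_product_sign(a, b) for a, b in pairs]
```

Every entry of `checks` is true for these operands, including the products that are signed zeros or infinities.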


The following table shows the results obtained when multiplying various classes of numbers together, assuming that neither overflow nor underflow occurs.


         -∞     -F     -0     +0     +F     +∞     NaN
-∞       +∞     +∞     *      *      -∞     -∞     NaN
-F       +∞     +F     +0     -0     -F     -∞     NaN
-0       *      +0     +0     -0     -0     *      NaN
+0       *      -0     -0     +0     +0     *      NaN
+F       -∞     -F     -0     +0     +F     +∞     NaN
+∞       -∞     -∞     *      *      +∞     +∞     NaN
NaN      NaN    NaN    NaN    NaN    NaN    NaN    NaN

Notes:
F    Means finite-real number.
*    Indicates floating invalid-operation exception.


When you need to multiply by a power of 2, the scaler and scalerl instructions can also be used.




Refer to the discussion of faults at the beginning of this chapter.

One or more operands is an unnormalized (including denormalized) value and the normalizing-mode bit in the arithmetic controls is set.

The following floating-point exceptions can be raised. Whether or not an exception results in a fault being raised depends on the state of its associated mask bit in the arithmetic controls.

Floating Overflow             Result is too large for destination format.
Floating Underflow            Result is too small for destination format.
Floating Invalid Operation    One source operand is 0 and the other is ∞.
                              One or more operands are an SNaN value.
Floating Inexact              Result cannot be represented exactly in destination format.


mulr     78C    REG
mulrl    79C    REG


src1,      src2,
reg/lit    reg/lit

Description: Performs a bitwise NAND operation on the src2 and src1 values and stores the result in dst.


src1,      src2,
reg/lit    reg/lit

Description: Performs a bitwise NOR operation on the src2 and src1 values and stores the result in dst.


not, notand

Mnemonic:
not        Not
notand     Not And

Format:
not        src,       dst
           reg/lit    reg

notand     src1,      src2,      dst
           reg/lit    reg/lit    reg

Description: Performs a bitwise NOT (not instruction) or NOT AND (notand instruction) operation on the src2 and src1 values and stores the result in dst.


not     g2, g4         # g4 ← NOT g2
notand  r5, r6, r7     # r7 ← NOT r6 AND r5


not        58A    REG
notand     584    REG


bitpos,    src,
reg/lit    reg/lit

Description: Copies the src value to dst with one bit toggled. The bitpos operand specifies the bit to be toggled.


# r7 ← r12 with the bit
# specified in r3 toggled
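Toggling a single bit amounts to an exclusive-OR with a one-bit mask. The sketch below illustrates that; the function name is hypothetical, and taking bitpos modulo 32 is an assumption made here to keep the model total, not something stated in the description above.

```python
WORD = 0xFFFFFFFF


def notbit(bitpos, src):
    """Return src with the selected bit toggled (32-bit model)."""
    return (src ^ (1 << (bitpos % 32))) & WORD  # modulo 32 is an assumption
```

Applying the function twice with the same bitpos restores the original value.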


src1,      src2,
reg/lit    reg/lit

Description: Performs a bitwise NOT OR operation on the src2 and src1 values and stores the result in dst.


or, ornot

or       Or
ornot    Or Not

or       src1,      src2,      dst
         reg/lit    reg/lit    reg

ornot    src1,      src2,      dst
         reg/lit    reg/lit    reg

Description: Performs a bitwise OR (or instruction) or ORNOT (ornot instruction) operation on the src2 and src1 values and stores the result in dst.


or     14, g9, g3      # g3 ← g9 OR 14
ornot  r3, r8, r11     # r11 ← r8 OR NOT r3
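The operand roles in the two examples can be restated as a small Python model. This is a sketch of the bitwise arithmetic only, with 32-bit masking added so that Python's unbounded integers behave like registers; the function names are illustrative.

```python
WORD = 0xFFFFFFFF


def or_(src1, src2):
    """dst = src2 OR src1"""
    return (src2 | src1) & WORD


def ornot(src1, src2):
    """dst = src2 OR (NOT src1), as in the ornot example comment."""
    return (src2 | ~src1) & WORD
```

Note that for ornot it is src1 that is complemented, matching "r11 ← r8 OR NOT r3" for the operands r3, r8.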


or       587    REG
ornot    58B    REG


remi, remo

remi     Remainder Integer
remo     Remainder Ordinal

src1,      src2,
reg/lit    reg/lit

Description: Divides src2 by src1 and stores the remainder in dst. The sign of the result (if nonzero) is the same as the sign of src2.
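A remainder that takes the sign of src2 (the dividend) differs from Python's `%` operator, which follows the sign of the divisor. The sketch below shows the difference; the function name `remi` is used only for illustration and the model ignores overflow and fault behavior.

```python
def remi(src1, src2):
    """Remainder of src2 / src1 with the sign of src2 (integer model)."""
    if src1 == 0:
        raise ZeroDivisionError("src1 is zero")
    magnitude = abs(src2) % abs(src1)
    return -magnitude if src2 < 0 else magnitude
```

For src2 = -7 and src1 = 3 this model returns -1, whereas the Python expression `-7 % 3` evaluates to 2.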


Refer to the discussion of faults at the beginning of this chapter.

Result is too large for destination format. This fault is signaled only when executing the remi instruction and if both of the following conditions are met: (1) the integer-overflow mask in the arithmetic-controls register is clear and (2) the source operands have like signs and the sign of the result operand is different from the signs of the source operands.


remi     748    REG
remo     708    REG


remr, remrl

remr     Remainder Real
remrl    Remainder Long Real

src1,        src2,        dst
freg/flit    freg/flit    freg


Description: Divides src2 by src1 and stores the remainder in dst. The sign of the result (if nonzero) is the same as the sign of src2.


For the remrl instruction, if the src1, src2, or dst operand references a global or local register, this register is the first (lowest numbered) of two successive registers. Also, this register must be even numbered (e.g., g0, g2, g4).


The following table shows the results obtained when computing the remainder of various classes of numbers, assuming that neither overflow nor underflow occurs.


src2 \ src1   -∞      -F          -0     +0     +F          +∞      NaN
-∞            *       *           *      *      *           *       NaN
-F            src2    -F or -0    **     **     -F or -0    src2    NaN
-0            -0      -0          *      *      -0          -0      NaN
+0            +0      +0          *      *      +0          +0      NaN
+F            src2    +F or +0    **     **     +F or +0    src2    NaN
+∞            *       *           *      *      *           *       NaN
NaN           NaN     NaN         NaN    NaN    NaN         NaN     NaN

Notes:
F     Means finite-real number.
*     Indicates floating invalid-operation exception.
**    Indicates floating zero-divide exception.


When the result is 0, its sign is the same as that of src2. When src1 is ∞, the result is equal to src2.

The result of this operation is always exact if the destination format is at least as wide as src2 and src1.




The remainder provided with the remr and remrl instructions is different from the remainder described in the IEEE floating-point standard. The difference is related to how the quotient (N) of the expression (src2/src1) is determined.


As shown below in the action statement, N for the remr and remrl instructions is the integer value obtained when the exact result (E) of the expression (src2/src1) is truncated toward zero. The magnitude of N is always less than or equal to the absolute value of E.


For the IEEE standard, N is simply the nearest integer value to E. Here, N 
may be less than, equal to, or greater than the absolute value of E. 


To help determine the IEEE remainder from the result given by the remr and remrl instructions, the following information about the quotient is given in the arithmetic-status field in the arithmetic controls:


Arithmetic
Status Bit    Meaning

6             Q1, the next-to-last quotient bit
5             Q0, the last quotient bit
4             QR, the value the next quotient bit would have if one more reduction were performed (the "round" bit of the quotient)
3             QS, set if the remainder after the QR reduction would be nonzero (the "sticky" bit of the quotient)


The information 
can then be used to determine the IEEE standard remainder, 
as shown in the example below. 


dst ← src2 - (N * src1);
# where N = truncate(src2/src1).
# Here, (src2/src1) is truncated
# toward zero to the nearest integer.
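The difference between the truncating remainder of the action statement and the IEEE remainder can be seen with Python's standard library, where `math.fmod` truncates the quotient toward zero and `math.remainder` rounds it to the nearest integer. The sketch below assumes Python floats as a stand-in for the processor's real formats; the function name `remr` is illustrative and the direct formula is only reliable for small, exactly representable operands.

```python
import math


def remr(src1, src2):
    """dst = src2 - N * src1, with N = src2/src1 truncated toward zero."""
    n = math.trunc(src2 / src1)
    return src2 - n * src1


truncating = remr(3.0, 5.0)           # N = 1, so 5 - 1*3
ieee_style = math.remainder(5.0, 3.0)  # N = 2 (nearest to 1.67), so 5 - 2*3
```

For 5 divided by 3 the truncating remainder is 2.0 while the IEEE-style remainder is -1.0, which is the kind of difference the quotient bits in the arithmetic status are meant to help resolve.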




Refer to the discussion of faults at the beginning of this chapter.

One or more operands is an unnormalized (including denormalized) value and the normalizing-mode bit in the arithmetic controls is set.

The following floating-point exceptions can be raised. Whether or not an exception results in a fault being raised depends on the state of its associated mask bit in the arithmetic controls.

Floating Overflow             Result is too large for destination format.
Floating Underflow            Result is too small for destination format.
Floating Zero Divide          The src1 operand is 0.
Floating Invalid Operation    The src2 operand is ∞.
                              The src1 operand is 0.
                              One or more operands are an SNaN value.
Floating Inexact              Result cannot be represented exactly in destination format.

remrl  g6, g8, fp1     # fp1 ← g8,g9 rem g6,g7

remr     683    REG
remrl    693    REG


Description: Returns process control to the calling procedure. The current stack frame (i.e., that of the called procedure) is deallocated and the FP is changed to point to the stack frame of the calling procedure. Instruction execution is continued at the instruction pointed to by the RIP in the calling procedure's stack frame, which is the instruction immediately following the call instruction.


As shown in the action statement below, the action that the processor takes on the return is determined by the return status and prereturn trace bits. These bits are contained in bits 0 through 3 of register r0 of the current set of local registers.


wait for any uncompleted instructions to finish;
case frame_status is
   2#000#: FP ← PFP;
           free current register set;
           if register_set (FP) not allocated
              then retrieve from memory(FP);
           end if;
           IP ← RIP;
   2#001#: x ← memory(FP-16);
           y ← memory(FP-12);
           do case 000 action;
           arithmetic_controls ← y;
           if execution_mode = supervisor
              then process_controls ← x;
           end if;
   2#010#: if execution_mode ≠ supervisor
              then go to case 000;
              else process_controls.T ← 0;
                   execution_mode ← user;
                   go to case 000;
           end if;
   2#011#: if execution_mode ≠ supervisor
              then go to case 000;
              else process_controls.T ← 1;
                   execution_mode ← user;
                   go to case 000;
           end if;
   2#110#: if execution_mode = supervisor
              then free current register set;
                   check_pending_interrupts;
                   # if continue here, no interrupt to do
                   do case 000 action;
           end if;
   2#111#: x ← memory(FP-16);
           y ← memory(FP-12);
           do case 000 action;
           arithmetic_controls ← y;
           if execution_mode = supervisor
              then process_controls ← x;
                   check_pending_interrupts;
           end if;


# process control returns to
# calling procedure environment


len,       src,
reg/lit    reg/lit

Description: Copies src to dst and rotates the bits in the resulting dst operand to the left (toward higher significance). (The bits shifted off the left end of the word are inserted at the right end of the word.) The len operand specifies the number of bits that the dst operand is rotated. The len operand can range from 0 to 31.

This instruction can also be used to rotate bits to the right. Here, the number of bits the word is to be rotated right is subtracted from 32 to get the len operand.
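The left rotation and the "32 minus n" trick for right rotation can be sketched in Python as follows. The function names are illustrative, and masking len to five bits is an assumption that simply enforces the stated 0 to 31 range.

```python
WORD = 0xFFFFFFFF


def rotate(length, src):
    """Rotate the 32-bit value src left by length bits."""
    length &= 31              # assumption: only the 0..31 range is modeled
    src &= WORD
    return ((src << length) | (src >> (32 - length))) & WORD


def rotate_right(n, src):
    """Rotate right by n bits using a left rotation of 32 - n."""
    return rotate((32 - n) % 32, src)
```

For example, rotating 0x12345678 left by 4 gives 0x23456781, and rotating it right by 4 (a left rotation of 28) gives 0x81234567.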


# r12 ← r8 with bits rotated
# r4 bits to left

roundr, roundrl

roundr     Round Real
roundrl    Round Long Real

src,         dst
freg/flit    freg


Description: Rounds src to the nearest integral value, depending on the rounding mode, and stores the result in dst.

For the roundrl instruction, if the src or dst operand references a global or local register, this register is the first (lowest numbered) of two successive registers. Also, this register must be even numbered (e.g., g0, g2, g4).

If the src operand is ∞ the result is src. If the src operand is not an integral value, a floating-inexact exception is raised.


Refer to the discussion of faults at the beginning of this chapter.

One or more operands is an unnormalized (including denormalized) value and the normalizing-mode bit in the arithmetic controls is set.

The following floating-point exceptions can be raised. Whether or not an exception results in a fault being raised depends on the state of its associated mask bit in the arithmetic controls.

Floating Overflow             Result is too large for destination format.
Floating Underflow            Result is too small for destination format.
Floating Invalid Operation    One or more operands are an SNaN value.
Floating Inexact              Result cannot be represented exactly in destination format.


roundrl  r4, r10       # r10,r11 ← r4,r5 rounded

roundr     68B    REG
roundrl    69B    REG


scaler, scalerl

scaler     Scale Real
scalerl    Scale Long Real

src1,      src2,        dst
reg/lit    freg/flit    freg


Description: Multiplies src2 by 2 to the power of src1 and stores the result in dst. The src1 operand is an integer, whereas src2 and dst are reals.
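Scaling by a power of two is what Python's `math.ldexp` does, so it makes a convenient stand-in for illustrating the operation. The sketch below assumes IEEE doubles rather than the processor's real formats, and the function name `scaler` is used only for illustration.

```python
import math


def scaler(src1, src2):
    """dst = src2 * 2**src1, with src1 an integer and src2 a real."""
    return math.ldexp(src2, src1)
```

As in the results table for this instruction, zeros and infinities pass through with their signs unchanged, for example `scaler(10, -0.0)` is still a negative zero.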


For the scalerl instruction, if the src2 or dst operand references a global or local register, this register is the first (lowest numbered) of two successive registers. Also, this register must be even numbered (e.g., g0, g2, g4).


The following table shows the results obtained when scaling various classes of numbers, assuming that neither overflow nor underflow occurs.


src2 \ src1   -N     0      +N
-∞            -∞     -∞     -∞
-F            -F     -F     -F
-0            -0     -0     -0
+0            +0     +0     +0
+F            +F     +F     +F
+∞            +∞     +∞     +∞
NaN           NaN    NaN    NaN

Notes:
F    Means finite-real number.
N    Means integer.


In most cases, only the exponent is changed and the mantissa (fraction) remains unchanged. However, when the src2 operand is a denormalized value, the mantissa is also changed and the result may turn out to be a normalized number. Similarly, if overflow or underflow results from a scale operation, the resulting mantissa will differ from the source's mantissa.




Refer to the sections titled "Floating Overflow Exception" and "Floating Underflow Exception" in Chapter 12 for further discussion of how overflow and underflow are handled.


Refer to the discussion of faults at the beginning of this chapter.

One or more operands is an unnormalized (including denormalized) value and the normalizing-mode bit in the arithmetic controls is set.

The following floating-point exceptions can be raised. Whether or not an exception results in a fault being raised depends on the state of its associated mask bit in the arithmetic controls.

Floating Overflow             Result is too large for destination format.
Floating Underflow            Result is too small for destination format.
Floating Zero Divide          The src1 operand is 0.
Floating Invalid Operation    One or more operands are an SNaN value.
Floating Inexact              Result cannot be represented exactly in destination format.

scalerl  g6, g2, fp0   # fp0 ← g2,g3 * 2^g6

scaler     677    REG
scalerl    676    REG


src,
reg/lit

Description: Searches the src value for the most-significant set bit (1 bit). If a most-significant 1 bit is found, its bit number is stored in dst and the condition code is set to 010₂. If the src value is zero, all 1's are stored in dst and the condition code is set to 000₂.


Action: dst ← 16#FFFFFFFF#;
AC.cc ← 2#000#;
for i in 31..0 reverse
loop
   if (src and 2^i) ≠ 0
   then
      dst ← i;
      AC.cc ← 2#010#;
      exit;
   end if;
end loop;


# assume g8 is nonzero
scanbit  g8, g10       # g10 ← bit number of
                       # most-significant set bit
                       # in g8; AC.cc ← 2#010#
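The scanbit search can be mirrored directly in Python. This is a sketch of the described behavior that returns the pair (dst, condition code) instead of writing registers; the function name and return convention are illustrative assumptions.

```python
def scanbit(src):
    """Return (dst, cc): bit number of the most-significant 1 bit, or all 1's."""
    src &= 0xFFFFFFFF
    for i in range(31, -1, -1):          # scan from bit 31 down to bit 0
        if src & (1 << i):
            return i, 0b010
    return 0xFFFFFFFF, 0b000             # no set bit found
```

A zero source yields 16#FFFFFFFF# with condition code 000₂, which lets software test the condition code rather than compare dst.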


scanbyte

scanbyte   src1,      src2
           reg/lit    reg/lit

Description: Performs a byte-by-byte comparison of src1 and src2 and sets the condition code to 2#010# if any two corresponding bytes are equal. If no corresponding bytes are equal, the condition code is set to 000₂.


Action: if (src1 and 16#000000FF#) = (src2 and 16#000000FF#) or
   (src1 and 16#0000FF00#) = (src2 and 16#0000FF00#) or
   (src1 and 16#00FF0000#) = (src2 and 16#00FF0000#) or
   (src1 and 16#FF000000#) = (src2 and 16#FF000000#)
then AC.cc ← 2#010#;
else AC.cc ← 2#000#;
end if;


Example: # assume r9 = 0x11AB1100
scanbyte  0x00AB0011, r9    # AC.cc ← 2#010#
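The byte-wise comparison can be sketched in Python as a loop over the four byte lanes. The function name and the choice to return the condition code as an integer are illustrative assumptions.

```python
def scanbyte(src1, src2):
    """Return 0b010 if any corresponding bytes of src1 and src2 match, else 0b000."""
    for shift in (0, 8, 16, 24):
        if (src1 >> shift) & 0xFF == (src2 >> shift) & 0xFF:
            return 0b010
    return 0b000
```

With the operands from the example, 0x00AB0011 and 0x11AB1100, the bytes in bits 16 through 23 are both 0xAB, so the condition code is 010₂.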


bitpos,    src,
reg/lit    reg/lit

Description: Copies the src value to dst with one bit set. The bitpos operand specifies the bit to be set.

Example: setbit  15, r9, r1    # r1 ← r9 with bit 15 set


shlo     Shift Left Ordinal
shro     Shift Right Ordinal
shli     Shift Left Integer
shri     Shift Right Integer
shrdi    Shift Right Dividing Integer

len,       src,
reg/lit    reg/lit


Description: Shifts src left or right by the number of bits indicated with the len operand and stores the result in dst. This operation (with the exception of the shri instruction, as described below) is equivalent to multiplying (shift left) or dividing (shift right) the src value by 2^len.

The shri instruction performs a conventional arithmetic right shift, which, when used as a divide, produces an incorrect quotient for negative src values. To get a correct quotient for a negative src value, use the shrdi instruction, which performs correct rounding of negative results.


if len < 32
   then dst ← src * 2^len
   else dst ← 0;
end if;

if len < 32
   then dst ← src / 2^len
   else dst ← 0;
end if;

if src ≥ 0
   then if len < 32
           then dst ← src / 2^len
           else dst ← 0;
        end if;
   else if len < 32
           then dst ← (src - 2^len + 1) / 2^len
           else dst ← -1;
        end if;
end if;
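The difference between an arithmetic right shift and a dividing right shift can be sketched in Python. Two readings are assumed here and are not confirmed by the text: that the last action block above describes shri, with "/" meaning division truncated toward zero, and that shrdi produces the quotient truncated toward zero. All function names are illustrative.

```python
def trunc_div(a, b):
    """Integer division that truncates toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def shri(length, src):
    """Arithmetic right shift, following the third action block (assumed shri)."""
    if src >= 0:
        return trunc_div(src, 1 << length) if length < 32 else 0
    if length < 32:
        return trunc_div(src - (1 << length) + 1, 1 << length)
    return -1


def shrdi(length, src):
    """Dividing shift: quotient truncated toward zero (assumed reading)."""
    return trunc_div(src, 1 << length) if length < 32 else 0
```

Under these assumptions `shri(1, -5)` gives -3, the same as Python's `-5 >> 1`, while `shrdi(1, -5)` gives -2, which is the quotient expected from dividing -5 by 2.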


Faults: STANDARD, Integer Overflow

Example: shli  13, g4, r6    # r6 ← g4 shifted left 13 bits

Opcode:
shlo     59C    REG
shro     598    REG
shli     59E    REG
shri     59B    REG
shrdi    59A    REG

See Also: divi, muli, rotate


sinr, sinrl

Mnemonics:
sinr     Sine Real
sinrl    Sine Long Real

src,         dst
freg/flit    freg


Description: Calculates the sine of src and stores the result in dst. The src value is an angle given in radians. The resulting dst value is in the range -1 to +1, inclusive.


For the sinrl instruction, if the src or dst operand references a global or local register, this register is the first (lowest numbered) of two successive registers. Also, this register must be even numbered (e.g., g0, g2, g4).


The following table shows the results obtained when taking the sine of various classes of numbers, assuming that neither overflow nor underflow occurs.


Src     Dst
-∞      *
-F      -1 to +1
-0      -0
+0      +0
+F      -1 to +1
+∞      *
NaN     NaN

Notes:
F    Means finite-real number.
*    Indicates floating invalid-operation exception.


In the trigonometric instructions, the 80960KB uses a value for π with a 66-bit mantissa, which is 2 bits more than are available in the extended-real format. The section in Chapter 12 titled "Pi" gives this π value, along with some suggestions for representing this value in a program.




Refer to the discussion of faults at the beginning of this chapter.

One or more operands is an unnormalized (including denormalized) value and the normalizing-mode bit in the arithmetic controls is set.

The following floating-point exceptions can be raised. Whether or not an exception results in a fault being raised depends on the state of its associated mask bit in the arithmetic controls.

Floating Underflow            Result is too small for destination format.
Floating Invalid Operation    The src operand is ∞.
                              One or more operands is an SNaN value.
Floating Inexact              Result cannot be represented exactly in destination format.

sinrl  g6, g0          # sine of value in g6,g7
                       # is stored in g0,g1

sinr     68C    REG
sinrl    69C    REG


spanbit

src,
reg/lit

Description: Searches the src value for the most-significant clear bit (0 bit). If a most-significant 0 bit is found, its bit number is stored in dst and the condition code is set to 010₂. If the src value is all 1's, all 1's are stored in dst and the condition code is set to 000₂.


Action: dst ← 16#FFFFFFFF#;
AC.cc ← 2#000#;
for i in 31..0 reverse
loop
   if (src and 2^i) = 0
   then
      dst ← i;
      AC.cc ← 2#010#;
      exit;
   end if;
end loop;


Example: # assume r2 is not 16#FFFFFFFF#
spanbit  r2, r9        # r9 ← bit number of
                       # most-significant clear bit
                       # in r2; AC.cc ← 2#010#


sqrtr, sqrtrl

sqrtr     Square Root Real
sqrtrl    Square Root Long Real

src,         dst
freg/flit    freg

For the sqrtrl instruction, if the src or dst operand references a global or local register, this register is the first (lowest numbered) of two successive registers. Also, this register must be even numbered (e.g., g0, g2, g4).

The following table shows the results obtained when taking the square root of various classes of numbers, assuming that neither overflow nor underflow occurs.


Src     Dst
-∞      *
-F      *
-0      -0
+0      +0
+F      +F
+∞      +∞
NaN     NaN

Notes:
F    Means finite-real number.
*    Indicates floating invalid-operation exception.


With these instructions, it is not possible to raise a floating overflow or floating underflow fault unless the src operand is in a floating-point register and the dst operand is not.




Refer to the discussion of faults at the beginning of this chapter.

One or more operands is an unnormalized (including denormalized) value and the normalizing-mode bit in the arithmetic controls is set.

The following floating-point exceptions can be raised. Whether or not an exception results in a fault being raised depends on the state of its associated mask bit in the arithmetic controls.

Floating Overflow             Result is too large for destination format.
Floating Underflow            Result is too small for destination format.
Floating Invalid Operation    The src operand is less than -0.
                              The src operand is an SNaN value.
Floating Inexact              Result cannot be represented exactly in destination format.

sqrtrl  g6, fp0        # fp0 ← sqrt of g6,g7

sqrtr     688    REG
sqrtrl    698    REG


st       Store
stob     Store Ordinal Byte
stos     Store Ordinal Short
stib     Store Integer Byte
stis     Store Integer Short
stl      Store Long
stt      Store Triple
stq      Store Quad

src,       dst
reg/lit    mem


Description: Copies a byte or string of bytes from a register or group of registers to memory. The src operand specifies a register or the first (lowest numbered) register of successive registers.


The dst operand specifies the address of the memory location where the byte or the first byte of a string of bytes is to be stored. The full range of addressing modes may be used in specifying dst. (Refer to Chapter 5 for a complete discussion of the addressing modes available with memory-type operands.)


The stob and stib, and stos and stis instructions store a byte and half word, respectively, from the low-order bytes of the src register. The st, stl, stt, and stq instructions copy 4, 8, 12, and 16 bytes, respectively, from successive registers to memory.


For the stl instruction, src must specify an even numbered register (e.g., g0, g2, ..., g12). For the stt and stq instructions, src must specify a register number that is a multiple of four (e.g., g0, g4, g8).


st  g2, 1256 (g6)      # word beginning at offset
                       # 1256 + (g6) ← g2


st       92    MEM
stob     82    MEM
stos     8A    MEM
stib     C2    MEM
stis     CA    MEM
stl      9A    MEM
stt      A2    MEM
stq      B2    MEM


src1,      src2,
reg/lit    reg/lit

Description: Subtracts (src1 - 1) from src2, adds bit 1 of the condition code (used here as a carry bit), and stores the result in dst. If the ordinal subtraction results in a carry, bit 1 of the condition code is set.


This instruction can also be used for integer subtraction. Here, if integer subtraction results in an overflow, bit 0 of the condition code is set.

The subc instruction does not distinguish between ordinals and integers: it sets bits 0 and 1 of the condition code regardless of the data type.


# Let the value of the condition code be xCx.
dst ← src2 - (src1 - 1) + C;
AC.cc ← 2#0CY#;
# C is carry from ordinal subtraction.
# Y is 1 if integer subtraction would have generated
# an overflow.


subc  g5, g6, g7       # g7 ← g6 - (g5 - 1)
                       # + Carry Bit


subi, subo

subi     Subtract Integer
subo     Subtract Ordinal

src1,      src2,
reg/lit    reg/lit


Description: Subtracts src1 from src2 and stores the result in dst. The binary results from these two instructions are identical. The only difference is that subi can signal an integer overflow.


subi     593    REG
subo     592    REG


subr, subrl

subr     Subtract Real
subrl    Subtract Long Real

src1,        src2,        dst
freg/flit    freg/flit    freg


For the subrl instruction, if the src1, src2, or dst operand references a global or local register, this register is the first (lowest numbered) of two successive registers. Also, this register must be even numbered (e.g., g0, g2, g4).

The following table shows the results obtained when subtracting various classes of numbers, assuming that neither overflow nor underflow occurs.


src2 \ src1   -∞     -F          -0      +0      +F          +∞     NaN
-∞            *      -∞          -∞      -∞      -∞          -∞     NaN
-F            +∞     ±F or ±0    src2    src2    -F          -∞     NaN
-0            +∞     -src1       ±0      -0      -src1       -∞     NaN
+0            +∞     -src1       +0      ±0      -src1       -∞     NaN
+F            +∞     +F          src2    src2    ±F or ±0    -∞     NaN
+∞            +∞     +∞          +∞      +∞      +∞          *      NaN
NaN           NaN    NaN         NaN     NaN     NaN         NaN    NaN

F    Means finite-real number.
*    Indicates floating invalid-operation exception.


When the difference between two operands of like sign is zero, the result is +0, except for the round toward -∞ mode, in which case the result is -0. This instruction also guarantees that +0 - (-0) = +0, and that -0 - (+0) = -0.

When one source operand is ∞, the result is ∞ of the expected sign. If both source operands are ∞ of the same sign, an invalid-operation exception is raised.




Refer to the discussion of faults at the beginning of this chapter.

One or more operands is an unnormalized (including denormalized) value and the normalizing-mode bit in the arithmetic controls is set.

The following floating-point exceptions can be raised. Whether or not an exception results in a fault being raised depends on the state of its associated mask bit in the arithmetic controls.

Floating Overflow             Result is too large for destination format.
Floating Underflow            Result is too small for destination format.
Floating Invalid Operation    Source operands are infinities of like sign.
                              One or more operands are an SNaN value.
Floating Inexact              Result cannot be represented exactly in destination format.

subrl  g6, fp0, fp1    # fp1 ← fp0 - g6,g7

subr     78D    REG
subrl    79D    REG


syncf

Description: Waits for any faults associated with prior uncompleted instructions to be generated.

Action: if arithmetic_controls.nif
then;
else wait until no imprecise faults can occur
     associated with any uncompleted instructions;


ld     xyz, g6
addi   r6, r8, r8
syncf
and    g6, 0xFFFF, g8
# the syncf instruction ensures that any faults
# that may occur during the execution of the
# ld and addi instructions occur before the
# and instruction is executed


synld

src,    dst
reg     reg
addr    addr


Description: Copies a word from the memory location specified with src into dst and waits for the completion of all memory operations, including those initiated prior to the synld instruction. When the load has been successfully completed, the condition code is set to 2#010#.


The primary function of this instruction is for reading IAC messages, the IAC Message Control word, or the IAC Interrupt Control Register. However, this instruction is not restricted to IAC applications. It may be used when it is important to guarantee the completion of the load operation before proceeding or to avoid a bad-access fault.


The setting of the condition code indicates whether or not the load was completed successfully. If the load operation results in a bad access condition (e.g., reading an AP-bus interconnect register), the condition code is set to 000₂, but the bad-access fault is not raised.


if PRCB.addressing_mode = physical
   then tempa ← src;
   else tempa ← physical_address (src);
end if;
tempa ← tempa and 16#FFFFFFFC#;
# force alignment
if tempa = 16#FF000004#
   then dst ← interrupt_control_reg;
        AC.cc ← 2#010#;
   else dst ← memory (tempa);
        if bad_access
           then AC.cc ← 2#000#;
           else AC.cc ← 2#010#;
        end if;
end if;
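The alignment and condition-code behavior of this action statement can be sketched in Python. This model assumes physical addressing, represents memory as a dictionary of word addresses, and treats a missing address as the bad-access case; those choices and all names are illustrative assumptions.

```python
INTERRUPT_CONTROL_REG_ADDR = 0xFF000004


def synld(src_address, memory, interrupt_control_reg=0):
    """Return (dst, cc) for a synchronous load in physical addressing mode."""
    tempa = src_address & 0xFFFFFFFC        # force word alignment
    if tempa == INTERRUPT_CONTROL_REG_ADDR:
        return interrupt_control_reg, 0b010
    if tempa not in memory:                 # stands in for a bad access
        return None, 0b000
    return memory[tempa], 0b010
```

A caller would test the returned condition code instead of relying on a fault, for example `synld(0x1003, {0x1000: 42})` reads the aligned word at 0x1000.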


lda    16#FF000010#, g8
synld  g8, g9          # g9 ← word from IAC
                       # message buffer;
                       # AC.cc = 2#010#


synmov, synmovl, synmovq

synmov     Synchronous Move
synmovl    Synchronous Move Long
synmovq    Synchronous Move Quad

dst,    src
reg     reg
addr    addr

Description: Copies 1 (synmov), 2 (synmovl), or 4 (synmovq) words from the memory location specified with src to the memory location specified with dst and waits for the completion of all memory operations, including those initiated prior to this instruction. When the move has been successfully completed, the condition code is set to 010₂.


The src and dst operands specify the address of the first (lowest address) word. These addresses should be for word boundaries (synmov), double-word boundaries (synmovl), or quad-word boundaries (synmovq). If not, the processor forces alignment to these boundaries.


The primary function of these instructions is for sending IAC messages. However, these instructions are not restricted to IAC applications. They may be used when it is important to guarantee the completion of the move operation before proceeding or to avoid a Bad Access Fault.


The setting of the condition 
code indicates 
whether 
or not the move was 


completed 
successfully. 
If the move operation 
results in a bad access con- 
dition (e.g., sending an lAC message to a non-existent 
agent on the AP-bus), 


the condition code is set to 0002, but the Bad Access Fault is not raised. 


Address FFOOOO1016is used to send an lAC message to the processor 
upon 
which the instruction 
is executed. 
Refer to Chapter 
11 for further 
infor- 
mation about sending internal lAC messages. 




synmov:
   if PRCB.addressing_mode = physical
      then tempa ← dst;                      # dst is used as a physical address
      else tempa ← physical_address (dst);   # dst translated into a physical address
   end if;
   tempa ← tempa and 16#FFFFFFFC#;           # force alignment
   if tempa = 16#FF000004#
      then interrupt_control_reg ← memory (src);
           AC.cc ← 2#010#;
      else temp ← memory (src);
           memory (tempa) ← temp;
           # write operations into memory (tempa) are
           # interpreted as noncacheable
           wait for completion;
           if bad_access
              then AC.cc ← 2#000#;
              else AC.cc ← 2#010#;
           end if;
   end if;


synmovl:
   if PRCB.addressing_mode = physical
      then tempa ← dst;                      # dst is used as a physical address
      else tempa ← physical_address (dst);   # dst is translated into a physical address
   end if;
   tempa ← tempa and 16#FFFFFFF8#;           # force alignment
   temp ← memory (src);
   memory (tempa) ← temp;
   # write operations into memory (tempa) are interpreted
   # as noncacheable
   wait for completion;
   if bad_access
      then AC.cc ← 2#000#;
      else AC.cc ← 2#010#;
   end if;




synmovq:
   if PRCB.addressing_mode = physical
      then tempa ← dst;                      # dst is used as a physical address
      else tempa ← physical_address (dst);   # dst is translated into a physical address
   end if;
   tempa ← tempa and 16#FFFFFFF0#;           # force alignment
   temp ← memory (src);
   if tempa = 16#FF000010#
      then AC.cc ← 2#010#;
           use temp as a received IAC message;
      else memory (tempa) ← temp;
           # write operations into memory (tempa) are interpreted
           # as noncacheable
           wait for completion;
           if bad_access
              then AC.cc ← 2#000#;
              else AC.cc ← 2#010#;
           end if;
   end if;
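The forced-alignment step is the only part of the three actions that differs by operand size, so it is worth seeing in isolation. The following Python sketch, which assumes nothing beyond the masks given in the pseudocode, shows how each mask clears the low address bits; the names `ALIGN_MASK`, `WORDS`, and `aligned_dst` are hypothetical.

```python
# Alignment masks taken from the three actions above.
ALIGN_MASK = {
    "synmov": 0xFFFFFFFC,    # word boundary
    "synmovl": 0xFFFFFFF8,   # double-word boundary
    "synmovq": 0xFFFFFFF0,   # quad-word boundary
}

# Number of 32-bit words moved by each instruction.
WORDS = {"synmov": 1, "synmovl": 2, "synmovq": 4}


def aligned_dst(mnemonic, dst):
    """Return the destination address after forced alignment."""
    return dst & ALIGN_MASK[mnemonic]


for name in ALIGN_MASK:
    print(name, hex(aligned_dst(name, 0xFF00001F)))
```

Each mask is simply the complement of (bytes moved minus one), which is why an unaligned dst is rounded down to the start of the block that contains it.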


   lda       16#FF000010#, g7    # g7 ← 16#FF000010#
   synmovq   g7, g8              # g7 ← IAC message from g8


Opcode:      synmov     600    REG
             synmovl    601    REG
             synmovq    602    REG




tanr, tanrl

Mnemonics:   tanr     Tangent Real
             tanrl    Tangent Long Real

Format:      tan*     src,         dst
                      freg/flit    freg


Description: Calculates the tangent of src and stores the result in dst. The src value is an angle given in radians. The resulting dst value is in the range of -∞ to +∞, inclusive; a result of -∞ or +∞ will result in a floating invalid-operation exception being signaled.

For the tanrl instruction, if the src or dst operand references a global or local register, this register is the first (lowest numbered) of two successive registers. Also, this register must be even numbered (e.g., g0, g2, g4).

The following table shows the results obtained when taking the tangent of various classes of numbers, assuming that neither overflow nor underflow occurs.


   Src      Dst

   -∞       *
   -F       -F to +F
   -0       -0
   +0       +0
   +F       -F to +F
   +∞       *
   NaN      NaN

Notes:
   F    Means finite-real number
   *    Indicates floating invalid-operation exception


If the source operand is a finite value, the result will be finite, unless the src 
operand is in a floating-point 
register and the dst operand is not. 
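The classes in the table can be explored with an ordinary IEEE-style math library. The sketch below uses Python's `math.tan`, which is not the 80960KB implementation; it is offered only because it follows the same broad conventions (signed zeros pass through, NaN propagates, and an infinite argument is rejected). The wrapper name `tan_class` is hypothetical.

```python
import math


def tan_class(src):
    """Return tan(src), or a marker string where the table shows '*'."""
    try:
        return math.tan(src)
    except ValueError:
        # Python reports an infinite argument as a domain error, which
        # plays the role of the invalid-operation exception here.
        return "invalid-operation"


for src in (float("-inf"), -1.0, -0.0, 0.0, 1.0, float("inf"), float("nan")):
    print(repr(src), "->", repr(tan_class(src)))
```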


In the trigonometric instructions, the 80960KB uses a value for π with a 66-bit mantissa, which is 2 bits more than are available in the extended-real format.

The section in Chapter 12 titled "Pi" gives this π value, along with some suggestions for representing this value in a program.




Faults: Refer to the discussion of faults at the beginning of this chapter.

Floating Reserved Encoding: One or more operands is an unnormalized (including denormalized) value and the normalizing-mode bit in the arithmetic controls is set.

The following floating-point exceptions can be raised. Whether or not an exception results in a fault being raised depends on the state of its associated mask bit in the arithmetic controls.

   Floating Overflow             Result is too large for destination format.

   Floating Underflow            Result is too small for destination format.

   Floating Invalid Operation    The src operand is ∞.
                                 One or more operands are an SNaN value.

   Floating Inexact              Result cannot be represented exactly in
                                 destination format.


   tanrl   g4, fp0     # tangent of value in g4,g5 is
                       # stored in fp0


Opcode:      tanr     68E    REG
             tanrl    69E    REG


Mnemonics:   teste     Test For Equal
             testne    Test For Not Equal
             testl     Test For Less
             testle    Test For Less or Equal
             testg     Test For Greater
             testge    Test For Greater or Equal
             testo     Test For Ordered
             testno    Test For Unordered

Description: Stores a true (1) in dst if the logical AND of the condition code and the mask-part of the opcode is not zero. Otherwise, the instruction stores a false (0) in dst.


   Instruction    Mask    Condition

   testno         000     Unordered
   testg          001     Greater
   teste          010     Equal
   testge         011     Greater or equal
   testl          100     Less
   testne         101     Not equal
   testle         110     Less or equal
   testo          111     Ordered

For the testno instruction (Unordered), a true is stored if the condition code is 2#000#; otherwise a false is stored.




For all instructions except testno:
   if (mask and AC.cc) ≠ 2#000#
      then dst ← 1;   # dst set for true
      else dst ← 0;   # dst set for false
   end if;

testno:
   if AC.cc = 2#000#
      then dst ← 1;   # dst set for true
      else dst ← 0;   # dst set for false
   end if;
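As a cross-check of the mask table and the two actions, the following Python sketch evaluates any of the eight test instructions against a given condition code. The names `TEST_MASK` and `eval_test` are hypothetical; only the mask values and the special handling of testno come from the text.

```python
# Mask part of each opcode, as listed in the table above.
TEST_MASK = {
    "testno": 0b000, "testg": 0b001, "teste": 0b010, "testge": 0b011,
    "testl": 0b100, "testne": 0b101, "testle": 0b110, "testo": 0b111,
}


def eval_test(mnemonic, cc):
    """Return the value (1 or 0) that the instruction would store in dst."""
    if mnemonic == "testno":
        # A zero mask can never AND to nonzero, so testno is defined
        # separately: true only when the condition code is 000.
        return 1 if cc == 0b000 else 0
    return 1 if (TEST_MASK[mnemonic] & cc) != 0 else 0


print(eval_test("testl", 0b100))
```

The separate branch for testno is the design point to notice: without it, the general rule would make testno always false.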


   # assume AC.cc = 2#100#
   testl   g9          # g9 ← 16#00000001#


   teste     COBR
   testne    COBR
   testl     COBR
   testle    COBR
   testg     COBR
   testge    COBR
   testo     COBR
   testno    COBR




Mnemonic:    xnor    Exclusive Nor
             xor     Exclusive Or

Format:      xnor    src1,      src2,      dst
                     reg/lit    reg/lit    reg

             xor     src1,      src2,      dst
                     reg/lit    reg/lit    reg

Description: Performs a bitwise XNOR (xnor instruction) or XOR (xor instruction) operation on the src2 and src1 values and stores the result in dst.


xnor:
   dst ← not (src2 or src1) or (src2 and src1);

xor:
   dst ← (src2 or src1) and not (src2 and src1);
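The two expressions above are the usual AND/OR/NOT decompositions of exclusive-or and its complement. A short Python sketch, assuming 32-bit operands and using hypothetical function names, confirms that they agree with the built-in `^` operator:

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def xor_action(src1, src2):
    """(src2 or src1) and not (src2 and src1), truncated to 32 bits."""
    return ((src2 | src1) & ~(src2 & src1)) & MASK32


def xnor_action(src1, src2):
    """not (src2 or src1) or (src2 and src1), truncated to 32 bits."""
    return (~(src2 | src1) | (src2 & src1)) & MASK32


a, b = 0x0F0F00FF, 0x00FF0F0F
print(hex(xor_action(a, b)), hex(xnor_action(a, b)))
```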


   xnor   r3, r9, r12     # r12 ← r9 XNOR r3
   xor    g1, g7, g4      # g4 ← g7 XOR g1


Opcode:      xnor    589    REG
             xor     586    REG


This section describes the floating-point processing capabilities of the 80960KB processor. The subjects discussed include the real number data types, the execution environment for floating-point operations, the floating-point instructions, and fault and exception handling.


The floating-point architecture used in the 80960KB processor is designed to allow a convenient implementation of the IEEE Standard 754-1985 for Binary Floating-Point Arithmetic. This hardware architecture, along with a small amount of software support, conforms to the IEEE standard and provides support for the following data structures and operations:


• Real (32-bit), long real (64-bit), and extended real (80-bit) floating-point number formats

• Add, subtract, multiply, divide, square root, remainder, and compare operations

• Conversion between integer and floating-point formats

• Conversion between different floating-point formats

• Handling of floating-point exceptions, including non-numbers (NaNs)

The software to support the 80960KB 
floating-point 
architecture 
is needed primarily 
to handle 
conversions 
between real numbers and decimal strings. 


In addition, the 80960KB floating-point architecture supports several functions that go beyond the IEEE standard. These functions fall into two categories:

• functions recommended in the appendix to the IEEE standard, such as copy sign and classify, and

• commonly used transcendental functions, including trigonometric, logarithmic, and exponential functions.


This section provides an introduction to real numbers and how they are represented in floating-point 
format. Readers who are already familiar with numeric processing techniques and the IEEE standard 
may wish to skip this section. 


As shown at the top of Figure 23, the real-number system comprises the continuum of real numbers from minus infinity (-∞) to plus infinity (+∞).




Because the size and number of registers that any computer can have are limited, only a subset of the real-number continuum can be used in real-number calculations. As shown at the bottom of Figure 23, the subset of real numbers that a particular processor supports represents an approximation of the real number system. The range and precision of this real-number subset are determined by the format that the processor uses to represent real numbers.


[Figure 23: Subset of binary real numbers that can be represented with IEEE single-precision (32-bit) floating-point format]


To increase the speed and efficiency of real number computations, computers or numeric processors typically represent real numbers in a binary floating-point format. In this format, a real number has three parts: a sign, a significand, and an exponent. Figure 24 shows the binary floating-point format that the processor uses. This format conforms to the IEEE standard.

The sign is a binary value that indicates whether the number is positive (0) or negative (1). The significand has two parts: a one-bit binary integer (also referred to as the j-bit) and a binary fraction. The j-bit is often not represented, but instead is an implied value. The exponent is a binary integer that represents the power of 2 by which the significand is multiplied.


[Figure 24: Binary floating-point format, showing the sign, exponent, and significand fields]
Table 16 shows how the real number 201.187 (in ordinary decimal format) is stored in floating-point format. The table lists a progression of real number notations that leads to the format that the 80960KB processor uses. In this format, the binary real number is normalized and the exponent is biased.


   NOTATION                    VALUE

   ORDINARY DECIMAL            201.187

   SCIENTIFIC DECIMAL          2.01187E₁₀2

   SCIENTIFIC BINARY           1.1001001001011111E₂111

   SCIENTIFIC BINARY           1.1001001001011111E₂10000110
   (BIASED EXPONENT)

   32-BIT FLOATING-POINT       SIGN    BIASED EXPONENT    SIGNIFICAND
   FORMAT (NORMALIZED)         0       10000110           1001001001011111
                                                          (leading 1. is implied)
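The last row of the table can be reproduced on any machine that supports IEEE single precision. The Python sketch below is an illustration rather than 80960KB code: it packs 201.187 into the 32-bit format with the standard `struct` module and splits the result into the three fields. The helper name `float32_fields` is hypothetical. The fraction printed has 23 bits, so its leading 16 bits correspond to the significand shown in the table.

```python
import struct


def float32_fields(value):
    """Split a number, rounded to IEEE single precision, into its fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent
    fraction = bits & 0x7FFFFF              # 23-bit fraction, integer bit implied
    return sign, biased_exponent, fraction


sign, exponent, fraction = float32_fields(201.187)
print(sign, format(exponent, "08b"), format(fraction, "023b"))
print("unbiased exponent:", exponent - 127)  # 127 is the single-precision bias
```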


In most cases, the processor represents real numbers in normalized form. This means that except for zero, the significand is always made up of an integer of 1 and a fraction as follows:

   1.fff...ff

For values less than 1, leading zeros are eliminated. (For each leading zero eliminated, the exponent is decremented by one.)

Representing numbers in normalized form maximizes the number of significant digits that can be accommodated in a significand of a given width. To summarize, a normalized real number consists of a normalized significand that represents a real number between 1 and 2 and an exponent that gives the position of the number's binary point.


The processor represents exponents in a biased form. This means that a constant is added to the actual exponent so that the biased exponent is always a positive number. The value of the biasing constant depends on the number of bits available for representing exponents in the floating-point format being used. The biasing constant is chosen so that the smallest normalized number can be reciprocated without overflow.


The real numbers that are encoded in the floating-point format described above are generally divided into three classes: ±0, ±nonzero-finite numbers, and ±∞. Encodings for non-numbers (NaNs) are also defined. The term NaN stands for "Not a Number." Figure 25 shows how the encodings for these numbers and non-numbers fit into the real number continuum. The encodings shown here are for the IEEE single-precision (32-bit) format, where the term "s" indicates the sign bit, "e" the biased exponent, and "f" the fraction. (The exponent values are given in decimal.)


Zero can be represented as a +0 or a -0 depending on the sign bit. Both encodings are equal in value. The sign of a zero result depends on the operation being performed and the rounding mode being used. Signed zeros have been provided to aid in implementing interval arithmetic. The sign of a zero may indicate the direction from which underflow occurred, or it may indicate the sign of an ∞ that has been reciprocated.


The class of signed, nonzero, finite values is divided into two groups: normalized and denormalized. The normalized finite numbers comprise all the nonzero finite values that can be encoded in a normalized real number format from zero to ∞. In the 32-bit form shown in Figure 25, this group of numbers includes all the numbers with biased exponents ranging from 1 to 254₁₀ (unbiased, the exponent range is from -126₁₀ to +127₁₀).


[Figure 25 notes: 1. Sign bit ignored. 2. Fractions must be nonzero.]


When real numbers become very close to zero, the normalized-number format can no longer be used to represent the numbers. This is because the range of the exponent is not large enough to compensate for shifting the binary point to the right to eliminate leading zeros.

When the biased exponent is zero, smaller numbers can only be represented by making the integer bit (and perhaps other leading bits) of the significand zero. The numbers in this range are called denormalized numbers. The use of leading zeros with denormalized numbers allows smaller numbers to be represented. However, this denormalization causes a loss of precision (the number of significant bits in the fraction is reduced by the leading zeros).


When performing normalized floating-point computations, a processor normally operates on normalized numbers and produces normalized numbers as results. Denormalized numbers represent an underflow condition.

A denormalized number is computed through a technique called gradual underflow. Table 17 gives an example of gradual underflow in the denormalization process. Here the 32-bit format is being used, so the minimum exponent (unbiased) is -126₁₀. The true result in this example requires an exponent of -129₁₀ in order to have a normalized number. Since -129₁₀ is beyond the allowable exponent range, the result is denormalized by inserting leading zeros until the minimum exponent of -126₁₀ is reached.


   Operation          Sign    Exponent*    Significand

   True Result        0       -129         1.01011100...00
   Denormalize        0       -128         0.101011100...00
   Denormalize        0       -127         0.0101011100...00
   Denormalize        0       -126         0.00101011100...00
   Denormal Result    0       -126         0.00101011100...00


In the extreme case, all the significant bits are shifted out to the right by leading zeros, creating a zero 
result. 
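The denormal result in Table 17 can be checked against any IEEE single-precision implementation. The following Python sketch is illustrative only: it builds the true result 1.010111 (binary) × 2^-129 exactly, stores it in the 32-bit format with the standard `struct` module, and inspects the fields. The helper name `float32_bits` is hypothetical.

```python
import math
import struct


def float32_bits(value):
    """Return the 32-bit single-precision encoding of value as an integer."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return bits


# 1.010111 (binary) is 0b1010111 / 2**6, so scale by 2**(-6 - 129).
true_result = math.ldexp(0b1010111, -6 - 129)

bits = float32_bits(true_result)
biased_exponent = (bits >> 23) & 0xFF
fraction = bits & 0x7FFFFF

print("biased exponent:", biased_exponent)
print("fraction       :", format(fraction, "023b"))
```

A biased exponent of zero marks the value as denormalized, and the fraction field carries the shifted significand with its three inserted leading zeros.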


The two infinities, +∞ and -∞, represent the maximum positive and negative real numbers, respectively, that can be represented in the floating-point format. Infinity is always represented by a zero fraction and the maximum biased exponent allowed in the specified format (e.g., 255₁₀ for the 32-bit format).

Whereas denormalized numbers represent an underflow condition, the two infinity numbers represent the result of an overflow condition. Here, the normalized result of a computation has a biased exponent greater than the largest allowable exponent for the selected result format.


Since NaNs are non-numbers, they are not part of the real number line. In Figure 25, the encoding space for NaNs in the 80960KB floating-point formats is shown above the ends of the real number line. This space includes any value with the maximum allowable biased exponent and a non-zero fraction. (The sign bit is ignored for NaNs.)

The IEEE standard defines two specific NaN values: a quiet NaN (QNaN) and a signaling NaN (SNaN). A QNaN is a NaN with the most significant fraction bit set; an SNaN is a NaN with the most significant fraction bit clear. QNaNs are allowed to propagate through most arithmetic operations without signaling an exception. SNaNs signal an invalid-operation exception whenever they appear as operands in arithmetic operations. Exceptions are discussed later in the section titled "Exceptions and Fault Handling."
The section "Operations on NaNs" provides detailed information on how the processor handles NaNs.
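The encoding rules for zeros, denormalized numbers, normalized numbers, infinities, and the two kinds of NaN can be summarized as a small decision procedure. The Python sketch below applies those rules to a 32-bit (real) bit pattern; the function name `classify_real` and the returned strings are hypothetical labels chosen for the example.

```python
def classify_real(bits):
    """Classify a 32-bit real encoding by its exponent and fraction fields."""
    exponent = (bits >> 23) & 0xFF      # biased exponent
    fraction = bits & 0x7FFFFF          # 23-bit fraction
    if exponent == 0xFF:                # maximum biased exponent
        if fraction == 0:
            return "infinity"
        # Most-significant fraction bit set: quiet; clear: signaling.
        return "QNaN" if fraction & 0x400000 else "SNaN"
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    return "normalized"


for pattern in (0x00000000, 0x00000001, 0x3F800000, 0x7F800000, 0x7F800001, 0x7FC00000):
    print(format(pattern, "08X"), classify_real(pattern))
```

The sign bit is deliberately ignored, since each class exists in both a positive and a negative form and the sign of a NaN carries no meaning.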


The processor supports three real-number data formats: real, long real, and extended real. These formats correspond directly to the single-precision, double-precision, and double-extended precision formats in the IEEE standard. Figure 26 shows these data formats and gives the resolution that each provides.


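[Figure 26: Real-number data formats. Real: bits 31-0, with the sign in bit 31, the exponent in bits 30-23, and the fraction in bits 22-0 (integer implied). Long real: bits 63-0, with the sign in bit 63. Extended real: bits 79-0, with the sign in bit 79.]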

   REAL             2^-126 to 2^127        (approximately 10^-45 to 10^38)

   LONG REAL        2^-1022 to 2^1023      (approximately 10^-324 to 10^308)

   EXTENDED REAL    2^-16382 to 2^16383    (approximately 10^-4950 to 10^+4932)


For the real and long-real formats, only the fraction is given for the significand. The integer is assumed to be 1 for all numbers except 0 and denormalized finite numbers.


For the extended-real 
format, the integer is contained in bit 63, and the most-significant 
fraction bit 
is bit 62. Here, the integer is explicitly 
set to 1 for normalized numbers, infinities, and NaNs, and 
to 0 for zero and denormalized 
numbers. 


Table 18 shows the encodings for all the classes of real numbers (i.e., zero, denormalized finite, normalized finite, and ∞) and NaNs, for each of the three real data-types.


   Class                        Sign    Biased      Integer(1)    Fraction
                                        Exponent

   POSITIVE    +∞               0       11...11     1             00...00

               +NORMALS         0       11...10     1             11...11
                                .       .           .             .
                                0       00...01     1             00...00

               +DENORMALS       0       00...00     0             11...11
                                .       .           .             .
                                0       00...00     0             00...01

               +ZERO            0       00...00     0             00...00

   NEGATIVE    -ZERO            1       00...00     0             00...00

               -DENORMALS       1       00...00     0             00...01
                                .       .           .             .
                                1       00...00     0             11...11

               -NORMALS         1       00...01     1             00...00
                                .       .           .             .
                                1       11...10     1             11...11

               -∞               1       11...11     1             00...00

   NaN         SNaN             X       11...11     1             0X...XX(2)

               QNaN             X       11...11     1             1X...XX


REAL: 


LONG 
REAL: 


EXTENDED 
REAL: 


Notes:

1. Integer is implied for real and long real formats and is not stored.

2. Fraction for SNaN must be non-zero.
An important feature of the 80960KB processor is that the floating-point processing capabilities have been integrated into the execution environment of the processor. Operations on floating-point numbers are carried out using the same registers that are used for ordinals and integers. In addition, four floating-point registers have been provided for extended-precision floating-point arithmetic.


The following 
sections 
describe 
how floating-point 
operations 
are handled 
in the processor's 
execution environment. 


All of the registers in the processor's execution environment (i.e., global, local, and floating point) can be used for floating-point operations. When using global or local registers, real values (i.e., 32 bits) are contained in one register; long-real values (i.e., 64 bits) are contained in two successive registers; and extended-real values (i.e., 80 bits) are contained in three successive registers.

Figure 27 shows how the three forms of the real data type are encoded when stored in global and local registers. Note that long-real values must be aligned on even-numbered register boundaries (e.g., g0, g2, ...). Extended-real values must be aligned on register boundaries that are an integral multiple of four (e.g., g0, g4, ...).
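These register-alignment rules lend themselves to a simple check, of the kind an assembler or code generator might perform. The Python sketch below is a hypothetical helper, not part of any Intel tool; the names `REGISTERS_USED`, `ALIGNMENT`, and `valid_first_register` are assumptions for the example.

```python
# Registers occupied by each real data type in global or local registers.
REGISTERS_USED = {"real": 1, "long_real": 2, "extended_real": 3}

# Required alignment of the first (lowest-numbered) register.
ALIGNMENT = {"real": 1, "long_real": 2, "extended_real": 4}


def valid_first_register(number, data_type):
    """Return True if register `number` may hold the start of the value."""
    return number % ALIGNMENT[data_type] == 0


for reg in range(0, 6):
    print("g%d" % reg, {t: valid_first_register(reg, t) for t in ALIGNMENT})
```

Note that an extended-real value occupies three registers but is aligned as if it occupied four.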


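[Figure 27: Encoding of the real data types in global and local registers.
Real (register n): sign in bit 31, exponent in bits 30-23, fraction in bits 22-0.
Long real (registers n and n+1; see note 1): register n holds the least-significant fraction bits; register n+1 holds the sign in bit 31, the exponent in bits 30-20, and the most-significant fraction bits in bits 19-0.
Extended real (three successive registers; see note 2): the first two registers hold the least-significant and most-significant fraction bits; the third holds the sign in bit 15 and the exponent in bits 14-0, with bits 31-16 reserved (initialized to 0).
Notes: 1. Register number must be even. 2. Register number must be an integral multiple of four.]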
Real values in the floating-point registers are always in the extended-real format. When a real or long-real value is moved from global or local registers to a floating-point register, the processor automatically converts it to the extended-real format.


Floating-point 
values are loaded from memory into global or local registers using the load (ld), load 
long (ldl), and load triple (ldt) instructions. Likewise, floating-point values in global or local registers 
are stored in memory using the store (st), store long (stl), and store triple (stt) instructions. 


Loading a floating-point value into a floating-point register requires two steps (two instructions). First, a floating-point value must be loaded from memory into one or more global or local registers. Then, the value must be moved to the floating-point register using a move real (movr), move long-real (movrl), or move extended-real (movre) instruction.

A similar two-step procedure is required to store a value from a floating-point register into memory. The value must first be moved into one or more global or local registers (using a movr, movrl, or movre instruction), then stored in memory.

This two-step method for moving values from memory into floating-point registers and vice versa may seem a little cumbersome; however, in practice it generally is not. Floating-point registers are most often used to store and accumulate intermediate results of computations. The contents of these registers are not normally stored in memory.


For example, the following instruction

   divr   r3, r4, fp2

causes the real value in local register r4 to be divided by the value in r3, with the extended-real result stored in floating-point register fp2. Here, a move operation from the local registers to the floating-point registers is not required, since it is implicit in the divide operation.

Either the move instructions (mov, movl, or movt) or the move-real instructions (movr, movrl, or movre) can be used to move real values among global and local registers. The move-real instructions are generally used to convert a real value from one format to another or for moving real values between the global or local registers and floating-point registers. The move instructions are used to move real values while keeping them in the same format.


When using the movr and movrl instructions to move floating-point 
numbers between the global or 
local registers and the floating-point registers, the processor automatically 
converts values from real 
and long-real format, respectively, into the extended-real 
format and vice versa. 
For example, the following instruction

   movr   g3, fp1

causes a 32-bit, real value in global register g3 to be converted to 80-bit, extended-real format and placed in floating-point register fp1.

Going the opposite direction, the instruction

   movrl   fp0, r4

causes an extended-real value in floating-point register fp0 to be converted to 64-bit, long-real format and placed in local registers r4 and r5.


The movre instruction moves 80-bit, extended-real values between registers, without format conversion. When this instruction is used to move a value from three global or local registers to a floating-point register, the processor extracts the 80-bit value from the three-word extended-real format. When moving a value from a floating-point register to global or local registers, the processor inserts the 80-bit value into the three registers in the three-word format.

The arithmetic controls are used extensively to control the arithmetic and faulting properties of floating-point operations. Table 19 shows the bits in the arithmetic controls that are used in floating-point operations.


The condition code flags are used to indicate the results of comparisons 
of real numbers, just as they 
are for integers and ordinals. 


The arithmetic status field is used to record results from the classify real (classr and classrl) 
and 
remainder real (remr and remrl) instructions. 
These instructions are discussed later in this section. 


The floating-point 
flags indicate exceptions to floating-point 
operations. 
Here, the term exception 
refers to a potentially undesirable operation (such as dividing a number by zero) or an undesirable 
result (such as underflow). 
The flags provide 
a means of recording 
the occurrence 
of specific 
exceptions. 


The floating-point 
masks provide a method of inhibiting the processor from invoking a fault handler 
when an exception is detected. 


Use of the floating-point flag and mask bits is discussed later in this section in "Exceptions and Fault Handling."


   Arithmetic
   Control Bits    Function

   0-2             Condition code
   3-6             Arithmetic status field
   8               Integer overflow flag
   12              Integer overflow mask
   16              Floating overflow flag
   17              Floating underflow flag
   18              Floating invalid-operation flag
   19              Floating zero-divide flag
   20              Floating inexact flag
   24              Floating overflow mask
   25              Floating underflow mask
   26              Floating invalid-operation mask
   27              Floating zero-divide mask
   28              Floating inexact mask
   29              Normalizing mode flag
   30-31           Rounding control
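The bit assignments in the table translate directly into shift-and-mask operations. The Python sketch below decodes a 32-bit arithmetic-controls word into named fields; it assumes only the bit positions listed above. The function name, the field names, and the sample word are hypothetical, and the meaning of the individual rounding-control encodings is not modeled.

```python
# Floating-point exceptions in bit order: flags start at bit 16, masks at bit 24.
FLOATING_EXCEPTIONS = ["overflow", "underflow", "invalid_operation",
                       "zero_divide", "inexact"]


def decode_arithmetic_controls(ac):
    """Return a dict of the fields used in floating-point operations."""
    fields = {
        "condition_code": ac & 0b111,               # bits 0-2
        "arithmetic_status": (ac >> 3) & 0b1111,    # bits 3-6
        "integer_overflow_flag": (ac >> 8) & 1,
        "integer_overflow_mask": (ac >> 12) & 1,
        "normalizing_mode": (ac >> 29) & 1,
        "rounding_control": (ac >> 30) & 0b11,      # bits 30-31
    }
    for i, name in enumerate(FLOATING_EXCEPTIONS):
        fields["floating_%s_flag" % name] = (ac >> (16 + i)) & 1
        fields["floating_%s_mask" % name] = (ac >> (24 + i)) & 1
    return fields


sample = (0b10 << 30) | (1 << 29) | (1 << 26) | (1 << 20) | 0b010
for key, value in decode_arithmetic_controls(sample).items():
    print(key, value)
```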


The normalizing-mode flag specifies whether the processor operates in normalizing mode (set) or not (clear).

Normalizing mode is the most common mode of operation. Here, the processor operates on valid floating-point operands, regardless of whether they are normalized or denormalized values.

When the processor is not operating in normalizing mode, it signals a reserved-encoding exception whenever it encounters a denormalized floating-point value as a source operand. In either mode, denormalized numbers are produced if the underflow exception is masked.


There are no flag or mask bits in the arithmetic controls for this exception. When a reserved-encoding 
exception 
is detected, 
the processor 
generates 
a floating reserved-encoding 
fault and leaves the 
destination operand unchanged (i.e., no result is stored). 


The unnormalized mode of operation is provided to allow unnormalized 
arithmetic to be simulated 
with software. 
Here, a fault handler routine can be used to perform 
unnormalized 
arithmetic 
whenever a reserved-encoding 
exception is signaled. 


Often the infinitely precise result of an arithmetic operation cannot be encoded exactly in the format of the destination operand. For example, the following value has a 24-bit fraction. The least-significant bit of this fraction (the underlined bit) cannot be encoded exactly in the real (32-bit) format:


   1.001 0000 1000 0011 1001 011 E₂101

   1.001 0000 1000 0011 1001 100 E₂101


A rounded result is called an inexact result. When an inexact result is produced, the floating-point 
inexact flag bit in the arithmetic controls is set. 


The processor rounds results according to the destination format (real, long real, or extended real) 
and the setting of the rounding-mode 
flags of the arithmetic controls. 
Four types of rounding are 
allowed, as described in Table 20. 


   Rounding Mode                   Description

   Round up (toward +∞)            Rounded result is close to but no less than
                                   the infinitely precise result.

   Round down (toward -∞)          Rounded result is close to but no greater
                                   than the infinitely precise result.

   Round toward zero (Truncate)    Rounded result is close to but no greater in
                                   absolute value than the infinitely precise
                                   result.

   Round to nearest (even)         Rounded result is close to the infinitely
                                   precise result. If two values are equally
                                   close, the result is the even value (i.e.,
                                   the one with the least-significant bit of
                                   zero).
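The four rounding modes can be demonstrated with exact rational arithmetic. The Python sketch below rounds an infinitely precise value to a chosen number of fraction bits under each mode. It is a model of the definitions in the table, not of the processor's rounding hardware; the function name `round_to_bits` and the mode strings are hypothetical.

```python
import math
from fractions import Fraction


def round_to_bits(x, frac_bits, mode):
    """Round the exact value x to a multiple of 2**-frac_bits."""
    scaled = x * (1 << frac_bits)
    lo, hi = math.floor(scaled), math.ceil(scaled)
    if mode == "up":                      # toward +infinity
        n = hi
    elif mode == "down":                  # toward -infinity
        n = lo
    elif mode == "zero":                  # truncate
        n = lo if scaled >= 0 else hi
    elif mode == "nearest":               # ties go to the even neighbor
        diff = scaled - lo
        if diff < Fraction(1, 2):
            n = lo
        elif diff > Fraction(1, 2):
            n = hi
        else:
            n = lo if lo % 2 == 0 else hi
    else:
        raise ValueError("unknown rounding mode: %r" % mode)
    return Fraction(n, 1 << frac_bits)


x = Fraction(0b10111, 16)                 # 1.0111 (binary), a tie at 3 fraction bits
for mode in ("up", "down", "zero", "nearest"):
    print(mode, round_to_bits(x, 3, mode), round_to_bits(-x, 3, mode))
```

The example value sits exactly halfway between two representable neighbors, so it exercises the tie-breaking rule of round to nearest as well as the sign-dependent behavior of the directed modes.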


When the infinitely precise result is between the largest positive finite value allowed in a particular format and +∞, the processor rounds the result as shown in Table 21.

   Rounding Mode                   Description

   Round up (toward +∞)            +∞
   Round down (toward -∞)          Maximum, positive finite value
   Round toward zero (Truncate)    Maximum, positive finite value
   Round to nearest (even)         +∞


When the infinitely precise result is between the largest negative finite value allowed in a particular format and -∞, the processor rounds the result as shown in Table 22.

   Rounding Mode                   Description

   Round up (toward +∞)            Maximum, negative finite value
   Round down (toward -∞)          -∞
   Round toward zero (Truncate)    Maximum, negative finite value
   Round to nearest (even)         -∞


The rounding modes have no effect on comparison operations, operations that produce exact results, 
or operations that produce NaN results. 


The floating-point instructions allow a result to be stored in a shorter destination than the source operands. For example, the instruction

   addr   fp1, fp2, g5

produces a real (32-bit) result from two extended-real (80-bit) source operands. In all such operations, only one rounding error occurs: the error that occurs when rounding the infinitely precise result to the size of the destination format.

Technically, an operation which computes a narrow result from wide operands is in violation of the IEEE standard. However, systems that are designed to conform to the IEEE standard do not need to use this capability of the processor.


The instruction format for floating-point instructions is the same as for the other processor instructions. When programming in assembly language, an assembly language statement begins with an instruction mnemonic and is followed by from one to three operands. For example, the multiply-real instruction mulr might be used as follows:

   mulr   r8, r9, fp3

Here, real operands in local registers r8 and r9 are multiplied together and the result is stored in floating-point register fp3.


From the machine level point of view, all floating-point 
instructions use the REG format. Refer to 
Appendix B for details on the REG format instructions. 


Operands for floating-point instructions can be either floating-point literals or registers. The processor recognizes two encodings for floating-point literals: +0.0 and +1.0.

All of the registers in the processor's execution environment (global registers g0 through g15, local registers r0 through r15, and floating-point registers fp0 through fp3) can be used as operands in floating-point instructions. (Of course, registers g15, r0, r1, and r2 would generally not be used for storing floating-point numbers, since they are reserved for stack management functions.)


When global or local registers are specified as operands, the instruction mnemonic (or opcode) determines how the values in these registers are interpreted. For example, there are two floating-point divide instructions: divide real (divr) and divide long real (divrl). When using the divr instruction, the processor assumes that global- or local-register operands contain real (32-bit) values. When using the divrl instruction, global- or local-register operands are assumed to contain long-real (64-bit) values.


With either instruction, floating-point 
registers (containing extended-real 
values) can also be used 
as operands. 


Using floating-point 
registers as operands allows mixed format or mixed precision arithmetic to be 
performed with either real and extended-real 
values or long-real and extended-real 
values. Mixed- 
format operations with real and long-real values are not supported. 


The processor's floating-point instructions consist of all instructions for which at least one operand
is a real data type.


Data Movement 


Data Type Conversion 


Basic Arithmetic 


Comparison 
and Classification 


Trigonometric 


Logarithmic 
and Exponential 


The following sections give a brief overview of the instructions in each group. Detailed descriptions
of the operations of these instructions are given in Section 10.


As has been described earlier in this section, the non-floating-point 
load and store instructions 
are 
used to move real values between registers and memory. 
Once in registers, the non-floating-point 
move instructions 
(mov, movl, and movt) are used to move real values between global and local 
registers without format conversion; 
whereas, the floating-point 
move instructions 
(movr, movrl, 
and movre) 
are used to move real values between 
global and local registers and floating-point 
registers. 


The copy-sign-real 
extended (cpysre) 
and copy-reverse-sign 
real-extended 
(cpyrsre) 
instructions 
provide a means of copying the sign of one extended-real 
value to another, if one of the values is in 
a floating-point 
register. This operation is best performed on real and long-real values using the bit 
instructions 
chkbit 
and alterbit. 


Two types of data type conversions 
are provided: 
conversion 
from one floating-point 
format to 
another (e.g., real to extended real) and conversion between integer and real. 


Conversion 
between floating-point 
formats is handled in either of two ways: 
explicitly by move 
instructions 
or implicitly by using the floating-point 
registers as operands in instructions. 


As described 
earlier in this section, the movr instruction 
implicitly converts values from real to 
extended real, and vice versa, when moving values between global or local registers and floating- 
point registers. Likewise, the movrl instruction implicitly converts values from long real to extended 
real, and vice versa. 


Conversion between real and long-real formats requires the use of both instructions. 
For example, 
the following two instructions convert a real value in global register g6 to a long-real value contained 
in g6 and g7, using a floating-point 
register for intermediate 
storage of the value: 


movr    g6, fp1
movrl   fp1, g6
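The effect of this two-step conversion can be modeled in a few lines of Python. This is only a sketch: the function name real_to_long_real is illustrative, and because Python has no extended-real type, a 64-bit float stands in for the intermediate floating-point register. Widening a real value is exact in either case, so the resulting long-real bit pattern is the same.

```python
import struct


def real_to_long_real(bits32: int) -> int:
    """Widen a 32-bit real bit pattern to a 64-bit long-real bit pattern."""
    # Interpret the 32 bits as a real value (the role played by movr).
    (value,) = struct.unpack(">f", struct.pack(">I", bits32))
    # Re-encode the same value as a long real (the role played by movrl).
    (bits64,) = struct.unpack(">Q", struct.pack(">d", value))
    return bits64


# 0x3FC00000 is the real encoding of 1.5.
print(f"{real_to_long_real(0x3FC00000):016X}")
```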


Implicit format conversion is also provided through the arithmetic, trigonometric, 
logarithmic, and 
exponential 
instructions. 
For example, the instruction 


addr    r4, r5, fp2


adds two real values together and produces an extended-real result.

cvtir      convert integer to real
cvtilr     convert long integer to long real
cvtri      convert real to integer
cvtril     convert real to long integer
cvtzri     convert truncated real to integer
cvtzril    convert truncated real to long integer


Both the cvtir and cvtilr instructions can be used to convert an integer to an extended-real 
value by 
specifying that the result be placed in a floating-point 
register. 


The convert real-to-integer instructions round off the real value to the nearest integer or long-integer
value. For the cvtri and cvtril instructions, the rounding mode determines the direction the real
number is rounded. For the convert truncated real-to-integer instructions (cvtzri and cvtzril),
rounding is always toward zero. The latter two instructions are provided to allow efficient
implementation of FORTRAN-like truncation semantics.


Extended-real 
values can be converted 
to integers 
by using a floating-point 
register as a source 
operand in either of the convert real-to-integer 
instructions. 


Converting 
long-real values to integers requires two instructions, 
as in the following example: 


movrl   g6, fp3
cvtzri  fp3, g6

The first instruction moves the long-real value to a floating-point register. The second instruction
converts the extended-real value to an integer.
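The difference between the rounding and truncating conversions can be illustrated with a short Python sketch. It assumes the rounding mode is round-to-nearest with ties going to the even integer, which is only one of the modes the processor supports. The function names are illustrative, and range checking against the 32-bit or 64-bit integer destination is omitted.

```python
import math


def convert_round_nearest(x: float) -> int:
    """Model of cvtri/cvtril under an assumed round-to-nearest (even) mode."""
    return round(x)  # Python's round() resolves ties to the even integer


def convert_truncate(x: float) -> int:
    """Model of cvtzri/cvtzril: always round toward zero."""
    return math.trunc(x)


for x in (2.5, 3.5, 2.7, -2.7):
    print(x, convert_round_nearest(x), convert_truncate(x))
```

For negative inputs the two conversions differ in the way FORTRAN-style truncation requires: -2.7 rounds to -3 but truncates to -2.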


addr       add real
addrl      add long real
subr       subtract real
subrl      subtract long real
mulr       multiply real
mulrl      multiply long real
divr       divide real
divrl      divide long real
remr       remainder real
remrl      remainder long real
roundr     round real
roundrl    round long real
sqrtr      square root real
sqrtrl     square root long real


The round instructions round the floating-point operand to its nearest integral (i.e., integer) value,
based on the current rounding mode. These instructions perform a function similar to the convert
real-to-integer instructions except that the result is in floating-point format.


Comparison 
of floating-point 
values differs from comparison 
of integers or ordinals because with 
floating-point 
values there are four, rather than the usual three, mutually exclusive relationships: 
less 
than, equal to, greater than, and unordered. 


The unordered relationship is true when at least one of the two values being compared is a NaN. This
additional relationship is required because, by definition, NaNs are not numbers, so they cannot have
greater than, equal, or less than relationships with other floating-point values.
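The four mutually exclusive relationships can be sketched in Python as follows. The function name compare_real and the string results are illustrative only; the processor reports the outcome through the condition code flags rather than through a return value.

```python
import math


def compare_real(a: float, b: float) -> str:
    """Return which of the four floating-point relationships holds for a and b."""
    if math.isnan(a) or math.isnan(b):
        return "unordered"  # at least one operand is a NaN
    if a < b:
        return "less"
    if a == b:
        return "equal"
    return "greater"


print(compare_real(1.0, 2.0))
print(compare_real(float("nan"), 2.0))
```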


cmpr       compare real
cmprl      compare long real
cmpor      compare ordered real
cmporl     compare ordered long real


All of these instructions set the condition code flags in the arithmetic controls to indicate the results
of the comparison. With the compare instructions (cmpr and cmprl), the condition code flags are
set to 000₂ for the unordered condition. With the compare ordered instructions (cmpor and cmporl),
the condition code flags are set to 000₂ and an invalid-operation exception is signaled for the
unordered condition.

Two branch instructions (bo and bno) allow conditional branching to be performed on an ordered
or unordered condition, respectively. With these instructions, the processor checks the condition
code flags for unordered (000) or ordered (111) and branches accordingly.


The classify-real instructions (classr and classrl) provide a means of determining the class of a
floating-point value (i.e., zero, denormalized finite, normalized finite, ∞, SNaN, or QNaN). The
result of this operation is stored in the arithmetic status field of the arithmetic controls.

sinr       sine real
sinrl      sine long real
cosr       cosine real
cosrl      cosine long real
tanr       tangent real
tanrl      tangent long real
atanr      arctangent real
atanrl     arctangent long real


f = C90FDAA2 2168C234 C₁₆
e = 2 if significand is 0.f

This value has a 66-bit mantissa, which is 2 bits more than is allowed in the significand of an
extended-real value. (Since 66 bits is not an even number of hex digits, two additional zeros have
been added to the value so that it can be represented in a hexadecimal format. The least-significant
hex digit (C₁₆) is thus 1100₂, where the two least significant bits represent bits 67 and 68 of the
mantissa.)


If the results of computations that explicitly use π are to be used in the sine, cosine, or tangent
instructions, the full 66-bit fraction for π should be used. This ensures that the results are consistent
with the argument reduction algorithms that these instructions use. Using a rounded version of π can
cause inaccuracies in result values, which if propagated through several calculations, might result
in meaningless results.


A common method of representing the full 66-bit fraction of π is to separate the value into two
numbers. For example, the following two long-real values added together give the value for π shown
above with the full 66-bit fraction:

highπ = 400921FB 54400000₁₆

lowπ = 3DE0B461 1A600000₁₆

Here highπ gives the most significant 33 bits of π and lowπ gives the least significant 33 bits. Similar
versions of π can also be written in the extended-real format.


When using this two-part π value in an algorithm, parallel computations should be performed on each
part, with the results kept separate. When all the computations are complete, the two results can be
added together to form the final result.
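The two-part representation can be checked with exact rational arithmetic. The Python sketch below assumes the 66-bit value is 0.f * 2^2 with the hexadecimal fraction f = C90FDAA2 2168C234 C (the standard leading hex digits of π/4, including the two padding zeros), splits it after the 33 most significant bits, and prints the long-real encoding of each part. All names are illustrative.

```python
import struct
from fractions import Fraction

# 68-bit field: the 66-bit fraction of pi followed by two padding zeros.
F = 0xC90FDAA22168C234C
PI_66 = Fraction(F, 1 << 68) * 4  # 0.f * 2**2

# Split the field after its 33 most significant bits.
HIGH_PI = Fraction((F >> 35) << 35, 1 << 66)
LOW_PI = Fraction(F & ((1 << 35) - 1), 1 << 66)


def long_real_hex(value: Fraction) -> str:
    """Return the 64-bit long-real encoding of value as 16 hex digits."""
    return struct.pack(">d", float(value)).hex().upper()


print(long_real_hex(HIGH_PI))
print(long_real_hex(LOW_PI))
print(HIGH_PI + LOW_PI == PI_66)
```

Each part fits in a long real without rounding, which is why the two parts can be carried separately through a computation and summed only at the end.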


The following instructions provide three different logarithmic functions, an exponential function,
and a scale function:

logbnr     log binary real
logbnrl    log binary long real
logr       log real
logrl      log long real
logepr     log epsilon real
logeprl    log epsilon long real
expr       exponent real
exprl      exponent long real
scaler     scale real
scalerl    scale long real


These instructions are described in detail in Section 10. The following is a brief description of their 
functions. 


The log binary instructions compute the IEEE recommended function logb(X). The result is an
integral value that is the binary log of X.

The log epsilon instructions compute the function Y * log (X + 1), where the log of X + 1 is a
base-2 logarithm.
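As a rough illustration, the log epsilon function can be written in Python as shown below. The function name and argument order are assumptions made for this sketch, and the use of math.log1p is simply one way to keep precision when X is close to zero; it is not a description of how the processor computes the result.

```python
import math


def log_epsilon(y: float, x: float) -> float:
    """Return y * log2(x + 1)."""
    # log1p(x) evaluates ln(1 + x) without first forming 1 + x,
    # which preserves precision when x is very small.
    return y * math.log1p(x) / math.log(2.0)


print(log_epsilon(3.0, 1.0))
print(log_epsilon(1.0, 1e-20))
```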
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The floating-point 
instructions can be divided into two groups: arithmetic and nonarithmetic. Arithmetic
instructions are those that are sensitive to real values, meaning that they distinguish
among NaN, ∞, normalized finite, denormalized finite, and zero values.


All but five of the floating-point instructions are arithmetic. The five nonarithmetic instructions are
move-real extended (movre), copy-sign real extended (cpysre), copy-reversed-sign real extended
(cpyrsre), and classify real (classr and classrl). These nonarithmetic instructions are insensitive to
real values and cannot generate floating-point exceptions or faults.


This distinction between arithmetic 
and nonarithmetic 
instructions 
is important because floating- 
point exceptions 
and faults can be signaled only during the execution of arithmetic instructions. 


As was described earlier in this section, the processor supports two types of NaNs: QNaN and
SNaN. An SNaN is any NaN value with its most-significant fraction bit set to 0 and at least one other
fraction bit set to 1. (If all the fraction bits are set to 0, the value is an ∞.) A QNaN is any NaN value
with the most-significant fraction bit set to 1. The sign bit of a NaN is not interpreted.


In general, when a QNaN is used in one or more arithmetic floating-point instructions, it is allowed
to propagate through a computation. An SNaN, on the other hand, causes a floating invalid-operation
exception to be signaled.


The floating invalid-operation exception has a flag and a mask bit associated with it in the arithmetic
controls. The mask bit determines how the processor handles an SNaN value. If the floating
invalid-operation mask bit is set, the SNaN is converted to a QNaN by setting the most significant fraction
bit of the value to a 1. The result is then stored in the destination and the floating invalid-operation
flag is set. If the invalid operation mask is clear, a floating invalid-operation fault is signaled and no
result is stored in the destination.
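The SNaN handling just described can be sketched in Python for the real (32-bit) format. The bit masks follow the IEEE single-precision layout (8-bit exponent, 23-bit fraction); the function and exception names are hypothetical and chosen only for this illustration.

```python
REAL_EXPONENT_MASK = 0x7F800000
REAL_FRACTION_MASK = 0x007FFFFF
REAL_QUIET_BIT = 0x00400000  # most-significant fraction bit


class InvalidOperationFault(Exception):
    """Stands in for the floating invalid-operation fault."""


def classify_nan(bits: int):
    """Return 'SNaN', 'QNaN', or None if the real bit pattern is not a NaN."""
    is_max_exponent = (bits & REAL_EXPONENT_MASK) == REAL_EXPONENT_MASK
    if not is_max_exponent or (bits & REAL_FRACTION_MASK) == 0:
        return None  # finite value or infinity
    return "QNaN" if bits & REAL_QUIET_BIT else "SNaN"


def handle_snan(bits: int, invalid_mask_set: bool):
    """Return (result_bits, invalid_flag) for an SNaN operand, or raise a fault."""
    if classify_nan(bits) != "SNaN":
        raise ValueError("operand is not an SNaN")
    if not invalid_mask_set:
        # Mask clear: fault is signaled and no result is stored.
        raise InvalidOperationFault()
    # Mask set: quiet the NaN and set the invalid-operation flag.
    return bits | REAL_QUIET_BIT, True


result, flag = handle_snan(0x7F800001, invalid_mask_set=True)
print(f"{result:08X}", flag)
```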


When the result is a QNaN, the format of the result is as shown in Table 23, depending on the form 
of the source operands. 


In some cases, a QNaN result is returned when none of the source operands are NaNs. Here, a
standard QNaN is returned. The significand for the standard QNaN is as follows:

Table 23

Source Operands                        QNaN Result

Only one operand is NaN,               QNaN version of NaN source
destination is same width

Only one operand is NaN,               QNaN version of NaN source, with
destination is longer                  fraction extended with zeros

Only one operand is NaN,               QNaN version of NaN source, with
destination is shorter                 fraction truncated

Both operands are NaNs                 QNaN version of source whose
                                       fraction field has greatest magnitude,
                                       with fraction extended or truncated
                                       as described above


Occasionally, 
a floating-point 
instruction can result in an exception being signaled. 
The processor 
recognizes six floating-point 
exceptions: 


Floating Reserved Encoding 


Floating Invalid Operation 


Floating Zero Divide 


Floating Overflow 


Floating Underflow 


Floating Inexact 


1. 
Situations in which one or more source operands are inappropriate 
for an operation and would 
cause an exception to be signaled. 


2. 
Situations in which the result of an operation is exceptional. 


The reserved encoding, invalid operation, and division-by-zero exceptions fall in the first category;
the overflow, underflow, and inexact exceptions fall in the second category.


Except for the floating reserved-encoding 
exception, each of these exceptions has a flag and a mask 
bit associated with it in the arithmetic controls. 
When an exception condition occurs, the processor 
performs one of the following operations: 


If the mask bit for the exception is set, the flag for the exception is set and instruction execution 
continues, substituting a default value in place of the result. 


If the mask bit for the exception is clear, the flag for the exception is not set and a floating-point
arithmetic fault is raised. The processor then stores diagnostic information in the fault
information area and diverts instruction execution to a fault handler.


Since the floating 
reserved-encoding 
exception does not have a flag or mask bit, it always results 
in a fault. 
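The general mask and flag rule can be summarized with a small Python model. This is a sketch of the rule as stated above, not of the processor's implementation: the class, function, and exception names are invented for illustration, and the special cases described later for underflow and inexact are not modeled.

```python
from dataclasses import dataclass, field

MASKABLE = ("invalid_operation", "zero_divide", "overflow", "underflow", "inexact")


class FloatingPointFault(Exception):
    """Stands in for a floating-point fault being raised."""


@dataclass
class ArithmeticControls:
    masks: set = field(default_factory=set)  # exceptions whose mask bit is set
    flags: set = field(default_factory=set)  # sticky exception flags


def signal(ac: ArithmeticControls, exception: str, default_value):
    """Apply the mask/flag rule and return the value to store, or raise a fault."""
    if exception == "reserved_encoding":
        raise FloatingPointFault(exception)  # no flag or mask: always a fault
    if exception not in MASKABLE:
        raise ValueError(f"unknown exception: {exception}")
    if exception in ac.masks:
        ac.flags.add(exception)  # flag is set and stays set until software clears it
        return default_value
    raise FloatingPointFault(exception)  # mask clear: flag not set, fault raised


ac = ArithmeticControls(masks={"zero_divide"})
print(signal(ac, "zero_divide", float("inf")), ac.flags)
```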


Note 


The floating-point 
exception 
flags are "sticky," 
which means that the processor 
does not implicitly 
clear 
them while carrying 
out floating-point 
operations. 
They may be cleared by software. 


As is described in Section 9, when a floating-point 
fault is signaled, the processor calls a single fault 
handler. 
This fault handler determines how to handle the specific fault subtype by interpreting 
the 
floating-point 
exception flags and the information 
in the fault record. 


When a reserved encoding is used as an operand in a floating-point 
instruction, 
or 


When a denormalized 
value is used as an operand 
in a floating-point 
instruction 
and the 
normalizing-mode 
bit in the arithmetic controls is clear. 


The first condition is rare. 
It can only occur if a program presents an extended-real 
value to the 
processor that has a zero j-bit (integer part) and a non-zero biased exponent. 


The second condition was discussed earlier in the section titled "Normalizing 
Mode." This condition 
is also rare, since the vast majority of programs run with the normalizing 
mode enabled. 


There is neither a flag nor a mask bit for this exception. When a reserved-encoding 
exception occurs, 
the processor raises a floating reserved-encoding 
fault and does not store a result. 


The invalid-operation 
exception indicates that one of the source operands is inappropriate for the type 
of operation being performed. 
The following conditions cause this exception to be signaled: 


Any arithmetic operation on an SNaN

Addition of infinities of unlike sign

Subtraction of infinities of like sign

Multiplication of zero by ∞

Division of zero by zero or ∞ by ∞

Remainder of x by y, if y is zero or x is ∞

Square root of a negative, nonzero value

Conversion of a NaN from floating-point format to integer format

Sine, cosine, or tangent of ∞

y * log (x), if:

    x is negative and nonzero,

    y is zero and x is ∞,

    y and x are zero, or

    y is ∞ and x is 1

Log epsilon of (y, x), if y is ∞ and x is 0

Compare ordered, if a source operand is a NaN


When the result is a floating-point 
value, the standard QNaN value is stored in the destination 
and the floating invalid-operation 
flag is set. (A discussion of how the processor handles NaNs 
was provided earlier in the section titled "Operations 
on NaNs.") 


When the result is an integer, the maximum negative integer is stored in the destination and the 
floating invalid-operation 
flag is set. 


When the mask is clear, no result is stored; the floating invalid-operation 
flag is not set; and the 
floating invalid-operation 
fault is signaled. 


The floating zero-divide 
exception is signaled when an exact non-finite result would be produced 
from finite operands. 
(Note that a different exception, overflow, is signaled when an infinite result 
is produced 
inexactly from finite operands.) 
The most common example of this exception 
is a 
division operation, where the divisor is zero and the dividend is a nonzero, finite value. 


When the floating zero-divide mask is set, a correctly signed ∞ is stored in the destination and the
floating zero-divide flag is set. When the mask is clear, no result is stored; the floating zero-divide
flag is not set; and a floating zero-divide fault is signaled.


The overflow exception occurs when the infinitely precise result of a floating-point instruction
exceeds the largest allowable finite value for the specified destination format. For example, if the
destination format is real (32 bits), overflow occurs when the infinitely precise result falls outside
the range -1.0 * 2^126 to 1.0 * 2^126 (exclusive), where 126 is the unbiased exponent of the result.


When the floating overflow mask is set, a rounded result is stored in the destination and the floating 
overflow flag is set. The current rounding mode determines 
the method used to round the result. 


When the mask is clear, no result is stored in the destination and the floating overflow flag is not set.
Instead, the processor stores the result in extended-real format in the fault information area. The
fraction of the extended-real value is rounded to the instruction's destination precision. For example,
if the destination operand's format is real (32 bits), the extended-real fraction is rounded to 23 bits,
with the 40 least-significant bits filled with zeros.


If the exponent exceeds the range of the extended-real format (16383 unbiased), then the exponent
is divided by 2^24576 and a flag (bit 1 of the fault flags byte or override flags byte) is set in the fault
information area to indicate that the exponent has been bias adjusted. After this fault information
is stored, a floating overflow fault is signaled.


When using the scale instructions (scaler or scalerl), massive overflow can occur, where the
infinitely precise result is too large to be represented, even with a bias adjusted exponent. Here, a
properly signed ∞ is stored in the fault record.


The floating overflow exception cannot occur on a conversion from floating-point 
format to integer 
format (although an integer overflow exception can occur). 


An underflow condition occurs when the infinitely precise result of a floating-point instruction is less
than the smallest possible normalized, finite value for the specified destination format. For example,
for the real (32-bit) format, underflow occurs when an infinitely precise result falls in the range
-1.0 * 2^-126 to 1.0 * 2^-126 (exclusive), where -126 is the unbiased exponent.


When a floating underflow condition occurs, the setting of the floating underflow mask determines 
how the processor handles the condition. 


If the mask is set when an underflow condition occurs, the processor goes ahead and denormalizes
the result. Then if the result is exact, it is stored in the destination and the floating underflow
exception is not signaled, nor is the floating underflow flag set. If, on the other hand, the
denormalized result is inexact, the floating underflow flag is set and the processor goes on to handle
the inexact condition as described in the next section.


If the floating underflow mask is clear when an underflow condition occurs, no result is stored in the
destination and the floating underflow flag is not set. Instead, the processor stores the result in
extended-real format in the fault information area, with the fraction of the extended-real value
rounded to the instruction's destination precision. For example, if the destination precision is real
(23-bit fraction) the 40 least-significant bits of the fraction are set to 0.


If the exponent of the value stored is less than the minimum allowable value in the extended-real
format (-16,382 unbiased), then the exponent is multiplied by 2^24576 and a flag (bit 1 of the fault or
override flags byte) is set in the fault information area to indicate that the exponent has been bias
adjusted. After this information is stored, a floating underflow fault is signaled.


The scale instructions can cause massive underflow to occur, where the infinitely precise result is too
small to be represented, even with a bias adjusted exponent. Here, a properly signed zero is stored
in the fault record.

Refer to the later section titled "Floating-Point Underflow Condition" for more information on the
interaction of the floating underflow and inexact exceptions.


The floating inexact exception occurs when an infinitely precise result cannot be encoded in the 
format specified for the destination 
operand. Either of the following two conditions can cause an 
inexact exception to be signaled: 


When a result is rounded and the result is not exact 


When overflow occurs and the floating overflow mask is set 


If the floating inexact mask is set when an inexact condition occurs and an unmasked overflow or 
underflow condition does not occur, the rounded result is stored in the destination and the floating- 
point inexact flag is set. The current rounding mode determines the method used to round the result. 


If the floating inexact mask is clear when an inexact condition occurs, the floating inexact flag is not 
set and one of the following operations is carried out: 


If only the inexact condition has occurred, the processor stores the rounded result in the specified 
destination, 
then raises a floating-inexact 
fault. 


If the inexact condition occurs along with overflow or underflow, no result is stored in the
destination. Instead, the processor stores the result in extended-real format in the fault
information area, as described for the floating overflow and underflow exceptions, then raises
a floating inexact fault.


Refer to the following section for more information on the interaction of the floating underflow and 
inexact exceptions. 


Two aspects of underflow are important in numeric processing: the "tininess" of a number and "loss
of accuracy." A result is tiny when it is nonzero and its value lies between ±2^Emin, where Emin is
the smallest unbiased exponent allowed in the destination format. For example, if the destination
format is long-real (64-bit format), a result is tiny if it is nonzero and in the range of +1 * 2^-1022 to
-1 * 2^-1022. The ability to detect a tiny result is important because such a result may cause an exception
to be signaled in a later operation (e.g., overflow on a division).
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Loss of accuracy occurs when a tiny result is approximated 
as part of the denormalization 
process 
so that it will fit into the destination format. 


In the 80960KB processor, tininess is detected after rounding as an underflow condition. Loss of 
accuracy is detected as an inexact condition. 


The algorithm 
in Figure 27 shows how the processor 
responds to these two conditions, 
when a 
floating-point 
operation produces a tiny result. 


An important point to note in this algorithm is that if the underflow mask is set, an underflow
exception is signaled only if the denormalized result is inexact. If the denormalized number is exact,
no flags are set and no faults are signaled.


generate infinitely precise result   # exponent and significand;
if exponent < underflow threshold
then
    if underflow fault mask clear
    then
        goto underflow fault handler;
        exit algorithm;
    else
        generate denormalized number;
        if denormalized significand equals infinitely precise significand
        then
            store denormalized result in destination;
            # no underflow is signaled;
        else
            set underflow flag in AC;
            if inexact fault mask is clear
            then
                goto inexact fault handler;
                exit algorithm;
            else
                set inexact flag in AC;
                store denormalized result in destination;
            end if;
        end if;
    end if;
else
    if infinitely precise result is inexact
    then
        if inexact fault mask is clear
        then
            goto inexact fault handler;
            exit algorithm;
        else
            set inexact flag in AC;
            store normalized result in destination;
        end if;
    else
        store normalized result in destination;
    end if;
end if;
exit algorithm
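The same decision logic can be expressed as a small Python function, which may be convenient for checking a fault handler or simulator against the algorithm. This is a sketch only: the function name, the boolean inputs, and the returned strings are invented for illustration, and the generation of the actual result values is not modeled.

```python
def tiny_result_response(tiny, denorm_exact, inexact, underflow_masked, inexact_masked):
    """Return (action, flags_set) following the Figure 27 decision logic.

    tiny             -- result exponent is below the underflow threshold
    denorm_exact     -- denormalized significand equals the infinitely precise one
    inexact          -- a non-tiny result cannot be represented exactly
    underflow_masked -- floating underflow mask is set
    inexact_masked   -- floating inexact mask is set
    """
    flags = set()
    if tiny:
        if not underflow_masked:
            return "underflow fault", flags
        if denorm_exact:
            return "store denormalized result", flags  # no underflow signaled
        flags.add("underflow")
        if not inexact_masked:
            return "inexact fault", flags
        flags.add("inexact")
        return "store denormalized result", flags
    if inexact:
        if not inexact_masked:
            return "inexact fault", flags
        flags.add("inexact")
    return "store normalized result", flags


# Tiny, inexact after denormalization, both masks set: both flags are raised.
print(tiny_result_response(True, False, True, True, True))
```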
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This section describes 
the interagent communication (IAC) mechanism of the 80960KB processor. Included is a
description of the IAC message structure, the IAC message sending and receiving mechanism,
and reference information on the available IAC messages.


Note 


The 80960KB 
processor's 
interagent 
communication 
mechanism 
is an extension 
to the architecture 
and 
may not be supported 
in other processors 
based on this architecture. 


The IAC facilities provide a mechanism for agents connected to the processor's bus to communicate
with the processor by means of messages. The agents that use these facilities may be other 80960KB
processors, I/O processors, or special purpose hardware. Programs running on the 80960KB
processor can also use this message-passing mechanism to send messages internally to the processor.


The primary function of these facilities is to provide an alternative to the interrupt mechanism 
for 
external 
hardware 
to communicate 
with the processor. 
Also, certain 
processor 
functions 
like 
reinitialization, 
purging the instruction cache, and setting breakpoint registers can only be carried out 
with this mechanism. 


IAC messages (referred to here as IACs) are four words in length and are exchanged by means of
message buffers that are mapped to memory. All the usable IACs are predefined. The processor
handles an IAC in much the same way as it handles an instruction.

The processor provides two mechanisms for exchanging IACs: external and internal. The external
IAC mechanism is used to pass IACs between two agents on the processor's bus. A processor uses
the internal IAC mechanism to pass an IAC to itself.


Figure 28 shows the format for an IAC message. Each message consists of a message-type field and
up to five parameter fields.

+--------------+-----------+---------------------+
| MESSAGE TYPE |  FIELD 1  |       FIELD 2       |
+--------------+-----------+---------------------+
|                    FIELD 3                     |
+------------------------------------------------+
|                    FIELD 4                     |
+------------------------------------------------+
|                    FIELD 5                     |
+------------------------------------------------+

Figure 28. IAC message format

The parameters can be 8, 16, or 32 bits in length, depending on the specified field. Many of the IACs
do not require parameters. When a message type does require one or more parameters, the processor
only looks at the required parameter fields. Those fields not used are ignored.
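As an illustration of the four-word message structure, the following Python sketch packs a message into 16 bytes. The exact bit positions are assumptions made for this example: the message type is placed in bits 24 through 31 of the first word, field 1 in bits 16 through 23, field 2 in bits 0 through 15, and fields 3 through 5 each occupy a full word stored in little-endian order. The function name and the message-type value used in the example are hypothetical.

```python
import struct


def pack_iac(msg_type, field1=0, field2=0, field3=0, field4=0, field5=0) -> bytes:
    """Pack an IAC message into four 32-bit words (assumed layout, see lead-in)."""
    if not 0 <= msg_type <= 0xFF:
        raise ValueError("message type must fit in 8 bits")
    if not 0 <= field1 <= 0xFF:
        raise ValueError("field 1 must fit in 8 bits")
    if not 0 <= field2 <= 0xFFFF:
        raise ValueError("field 2 must fit in 16 bits")
    word0 = (msg_type << 24) | (field1 << 16) | field2
    # Fields 3 to 5 are full 32-bit words; struct.pack rejects out-of-range values.
    return struct.pack("<4I", word0, field3, field4, field5)


# 0x40 is a made-up message type used only to exercise the function.
message = pack_iac(0x40, field1=1, field2=2, field3=3)
print(message.hex())
```

Unused parameter fields are simply left at zero here, since the processor ignores fields that a given message type does not require.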


No special software, such as dedicated data structures or stacks, is required to handle IACs. An IAC
is sent with a quad synchronous move instruction (synmovq). When the processor receives an IAC,
it handles it independently from the program execution environment. It does not use the instruction
execution unit, the registers (global or local), the stack, or memory. Thus, the state of the processor
when the IAC is received does not need to be saved.


Some IACs, such as the purge instruction cache IAC, do not affect the processor's state. The
processor treats these IACs as if they were an instruction inserted in the control flow of the process.
When the IAC action is complete, the processor resumes work on the program it is currently running.

Other IACs, such as the reinitialize processor IAC, cause the state of the processor or the control of
the currently running program to be permanently changed. In these instances, the processor resumes
activity in its new processor state, following the execution of the IAC.


All IACs are assumed to have a priority of 31, so the processor executes the action requested in the
IAC message immediately, even if the processor's current priority is 31. While the processor is
handling an IAC, it will not respond to interrupts signaled on the interrupt pins.

Internal IACs are used for functions such as setting breakpoint registers, purging the instruction
cache, or sending software initiated interrupts.


1. Load the message into four consecutive words in memory, with the first word aligned on a word
   boundary.

2. Execute a synmovq instruction to move the message from its source address to destination
   address FF000010₁₆.

When the destination operand of a synmovq instruction is FF000010₁₆, the processor interprets the
instruction as a send internal-IAC instruction. The processor then receives the IAC by moving the
message from memory into an internal message buffer.

The action of the synmovq instruction ensures that the loading of the message into the processor
is completed before the processor is allowed to perform any other chores.


Note 


The address range of FF000000₁₆ through FFFFFFFF₁₆ is reserved for interrupt handling and IAC
message passing.


External IACs are used by agents external to the processor to initiate processor actions such as testing
for pending interrupts or freezing the processor. External IACs can be sent between two 80960KB
processors that are connected to the same bus or by external logic that duplicates the external IAC
sending mechanism. The following sections describe how one processor sends an IAC to another
processor. The 80960KB Hardware Designer's Reference Manual describes the requirements that
external logic must meet to perform these same functions.

Sending an external IAC message is similar to sending an internal IAC message, except that the
address of the receiving agent is specified in a slightly different way. Figure 29 shows the required
encoding of the address for the receiving agent.


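 31        24 23              14 13      9 8          4 3      0
+------------+------------------+---------+------------+--------+
|  11111111  |    ADDRESS OF    |         |  PRIORITY  |        |
|            |  IAC RECIPIENT   |         |            |        |
+------------+------------------+---------+------------+--------+

Figure 29. Address encoding for the receiving agent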


At initialization each agent on the bus is assigned a unique address in the range of FF000C00₁₆ to
FFFFCC00₁₆. To send an IAC to an agent, the sending agent sends the message to the address assigned
to the receiving agent. As shown in Figure 29, only bits 14 through 23 of this address are interpreted
to determine the address of the receiving agent. Bits 4 through 8 of this address are used to encode
the priority of the message.

For example, to send a priority 25₁₀ IAC to the agent at address 0000000001₂, the message address
would be FF004D90₁₆.
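The address encoding can be reproduced with a few shifts and masks. In the Python sketch below, the constant IAC_BASE (FF000C00 hex) is an inference from the address range and the worked example in the text, which together imply that bits 9 through 13 hold 00110 and bits 0 through 3 are zero; treat it as an assumption rather than a documented value. The function name is illustrative.

```python
# Fixed bits of an external IAC address, inferred from the documented
# range FF000C00..FFFFCC00 and the worked example (assumption).
IAC_BASE = 0xFF000C00


def iac_address(agent: int, priority: int) -> int:
    """Build the message address for a 10-bit agent address and 5-bit priority."""
    if not 0 <= agent < (1 << 10):
        raise ValueError("agent address must fit in bits 14 through 23")
    if not 0 <= priority < (1 << 5):
        raise ValueError("priority must fit in bits 4 through 8")
    return IAC_BASE | (agent << 14) | (priority << 4)


# Priority 25 IAC to the agent at address 0000000001 (binary).
print(f"{iac_address(1, 25):08X}")
```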


To send an external lAC from one 80960KB 
processor 
to another, software must perform 
the 
following steps: 


1. 
Load the message into four consecutive words in memory, with the first word aligned on a word 
boundary. 


2. 
Execute a synmovq instruction to move the message from its source address to the address of 
the receiving agent (encoded in the form shown in Figure 29). 


3. 
Check the condition code in the arithmetic controls to determine if the message was received 
(0102) or rejected (0002), 


The synmovq instruction ensures that the sending processor does not execute any other instructions
until the synmovq instruction is complete. It also sets the condition code bits to indicate whether
or not the move was successful. A successful move is interpreted as the IAC being received by the
processor.
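The sending sequence can be modeled in a few lines. This is only a behavioral sketch: the
`bus_write_quad` callable is a hypothetical stand-in for the bus transfer that synmovq performs, and
it is assumed to return True when the transfer is acknowledged.

```python
from typing import Callable, Sequence

CC_RECEIVED = 0b010   # condition code when the message was received
CC_REJECTED = 0b000   # condition code when the message was rejected


def send_external_iac(message: Sequence[int], dest_address: int,
                      bus_write_quad: Callable[[int, Sequence[int]], bool]) -> int:
    """Model of the send steps: four words, one synchronous move, then a condition code."""
    if len(message) != 4:
        raise ValueError("an IAC message is four consecutive words")
    words = [w & 0xFFFFFFFF for w in message]           # step 1: the four message words
    acknowledged = bus_write_quad(dest_address, words)  # step 2: stands in for synmovq
    return CC_RECEIVED if acknowledged else CC_REJECTED  # step 3: what software then checks


# Usage with a stand-in bus that always acknowledges.
cc = send_external_iac([0x40000000, 0, 0, 0], 0xFF004D90, lambda addr, words: True)
print(bin(cc))
```

The model blocks until the hypothetical bus call returns, which mirrors the statement that no other
instruction executes until synmovq completes.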


A processor receives and handles an external IAC in much the same manner as it receives and
handles an interrupt. To configure a processor to receive external IACs, vector INT0 of the
interrupt-control register (shown in Figure 19) is set to 0. The INT0 pin on the processor chip then
becomes the IAC pin. (Refer to Section 7, "Interrupts From Interrupt Pins," for further discussion
of the interrupt pins and interrupt-control register.)


When the processor receives a signal on the IAC pin, it handles it initially as if it were receiving an
interrupt. It reads the vector number associated with this pin (bits 0 through 7 of the interrupt-control
register). If it is zero, the processor recognizes that it is receiving an external IAC. It then reads the
four-word IAC message from the bus and performs the requested IAC.


The processor acts immediately on any IAC that it receives. For efficient system operation, external
logic must therefore be provided to ensure that low-priority IAC messages do not interrupt the
processor while it is handling a higher-priority task. The handshaking for this operation is provided
by the write-external-priority mechanism described in Section 6.


Using the write-external-priority mechanism, the processor keeps the external logic updated
regarding the processor's current priority. When an IAC is sent to the processor, the external logic
intercepts it and reads the priority. The external logic then determines whether the IAC priority is
above that of the processor or not. If the IAC has a higher priority, the external logic sends an
acknowledge signal to the sending processor, then signals the receiving processor by asserting the
IAC pin. If the IAC has an equal or lower priority, the external logic sends a non-acknowledge signal
to the sending processor.


The sending processor uses the acknowledge or non-acknowledge signal to set the condition code
and complete the synmovq instruction.


While the processor is servicing an IAC, it performs some handshaking with the external logic so that
the logic knows when the processor has finished work on an IAC. The external logic is then able to
reject any IAC that it receives while the processor is servicing another IAC.
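The decision made by the external logic can be sketched as a small state machine. The class and
method names are hypothetical, and the model ignores signal timing; it only captures the
acknowledge and non-acknowledge rules described above, reading the priority from bits 4 through 8
of the message address.

```python
class ExternalIacLogic:
    """Behavioral model of the external logic that screens incoming IACs."""

    def __init__(self) -> None:
        self.processor_priority = 0   # last value written by the processor
        self.busy = False             # True while the processor is servicing an IAC

    def write_external_priority(self, priority: int) -> None:
        """The processor reports its current priority."""
        self.processor_priority = priority

    def on_iac(self, message_address: int) -> bool:
        """Return True for acknowledge (IAC pin asserted), False for non-acknowledge."""
        priority = (message_address >> 4) & 0x1F
        if self.busy or priority <= self.processor_priority:
            return False
        self.busy = True
        return True

    def on_iac_done(self) -> None:
        """Handshake from the processor: work on the current IAC is finished."""
        self.busy = False


logic = ExternalIacLogic()
logic.write_external_priority(10)
print(logic.on_iac(0xFF004D90))   # priority 25 against processor priority 10
```

An IAC of equal priority is rejected in this model because the text acknowledges only IACs of
strictly higher priority.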


Table 24 gives a list of the IAC messages that the processor can send either internally or externally.


Interrupt Handling        Processor Management

Interrupt                 Purge Instruction Cache
Test Pending Interrupt    Set Breakpoint Register
                          Store System Base
                          Freeze
                          Continue Initialization
                          Reinitialize Processor


The following section provides detailed descriptions of the operations carried out for each of the
IACs. This section is organized alphabetically by IAC title for easy reference.
Continue Initialization


Message Type:  92₁₆


Function:      Carries out the initialization procedure that follows the processor self test.
               The processor executes the initialization procedure beginning with reading the
               initial memory image from ROM. The self test is not performed.

               Refer to the section in Chapter 7 titled "Processor Initialization" for further
               details on the initialization process.


Freeze


Message Type:  91₁₆


Function:      Stops the processor. The processor puts itself in the stopped state.
Interrupt


Message Type:  40₁₆


Parameters:    Field 1         Interrupt vector
               Fields 2 - 5    Not Used


Function:      Generates an interrupt request. The interrupt vector is given in field 1 of the
               IAC message. The processor handles the interrupt request just as it does
               interrupts received from other sources. If the interrupt priority is higher than
               the processor's current priority, the processor services the interrupt request
               immediately. Otherwise, it posts the interrupt in the pending interrupts section
               of the interrupt table.

               Refer to Chapter 8 for further information on the servicing of interrupt IACs.
Purge Instruction Cache


Message Type:  89₁₆


Function:      Invalidates all entries in the processor's internal instruction cache.
Reinitialize Processor


Message Type:  93₁₆


Parameters:    Fields 1 - 2    Not Used
               Field 3         Address of System Address Table
               Field 4         Address of Processor Control Block
               Field 5         Start Instruction IP


Function:      Reestablishes the processor state. In reinitializing itself, the processor first
               locates the system address table and the processor control block in the IMI from
               the addresses given in fields 3 and 4.

               The processor then begins executing the instruction list beginning with the IP
               given in field 5.


Set Breakpoint Register


Message Type:  8F₁₆


Parameters:    Fields 1 - 2    Not Used
               Field 3         Breakpoint IP
               Field 4         Breakpoint IP
               Field 5         Not Used


Function:      Enables or disables two breakpoints. When the processor receives this IAC, it
               conditionally loads the parameters from fields 3 and 4 into breakpoint registers
               0 and 1, respectively. Field 3 provides a breakpoint IP for breakpoint register 0,
               and field 4 provides a breakpoint IP for breakpoint register 1. Bit 1 in each of
               these fields is a breakpoint disable flag.

               If the disable flag in one of these fields is set, the breakpoint for the
               corresponding breakpoint register is disabled. Otherwise, the IP value in the
               field is loaded into the corresponding breakpoint register and the breakpoint is
               enabled.

               Breakpoints are described in the section in Chapter 10 titled "Breakpoint-Trace
               Mode."
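The handling of fields 3 and 4 can be illustrated with a small sketch. The function name is
hypothetical. Masking off the two low-order bits of the loaded IP is an assumption, made because
instructions begin on word boundaries; the text only says that the IP value in the field is loaded.

```python
BREAKPOINT_DISABLE_FLAG = 1 << 1   # bit 1 of field 3 or field 4


def decode_breakpoint_field(field: int):
    """Return (enabled, ip) for one breakpoint field of a Set Breakpoint Register IAC."""
    if field & BREAKPOINT_DISABLE_FLAG:
        return (False, None)                 # breakpoint disabled, register not loaded
    return (True, field & 0xFFFFFFFC)        # assumption: word-aligned IP is loaded


# Field 3 enables a breakpoint at 0x8000; field 4 disables the second breakpoint.
print(decode_breakpoint_field(0x00008000), decode_breakpoint_field(0x00000002))
```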
Store System Base


Message Type:  80₁₆


Parameters:    Fields 1 - 2    Not Used
               Field 3         Destination Address
               Fields 4 - 5    Not Used


Function:      Stores the current locations of the system address table and the PRCB in a
               specified location in memory. The address of the system address table is stored
               in the word starting at the byte specified in field 3, and the address of the PRCB
               is stored in the next word in memory (field 3 address plus 4).
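As a minimal sketch of the result of this IAC, memory can be modeled as a dictionary that maps
byte addresses to 32-bit words. The function name, the dictionary model, and the sample addresses
are all hypothetical.

```python
def store_system_base(memory: dict, field3: int, sat_address: int, prcb_address: int) -> None:
    """Write the SAT address at field3 and the PRCB address in the next word."""
    memory[field3] = sat_address & 0xFFFFFFFF
    memory[field3 + 4] = prcb_address & 0xFFFFFFFF


memory = {}
store_system_base(memory, 0x2000, sat_address=0x00000100, prcb_address=0x00000400)
print({hex(k): hex(v) for k, v in memory.items()})
```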
Test Pending Interrupt


Message Type:  41₁₆


Function:      Tests for pending interrupts. The processor checks the pending interrupt section
               of the interrupt table for a pending interrupt with a priority higher than the
               processor's current priority. If a higher priority interrupt is found, it is
               serviced immediately. Otherwise, no action is taken.


APPENDIX A
INSTRUCTION AND DATA STRUCTURE QUICK REFERENCE


This section provides two lists of 80960KB instructions: one sorted by assembly-language
mnemonic and another sorted by machine-level opcode. In these lists, each entry includes the
assembly-language mnemonic for an instruction; the operands (given in the required order); the
machine-level opcode and instruction type (i.e., REG, MEM, COBR, CTRL); and the page
number in Chapter 11 where the detailed description of the instruction is given.


Instruction List by Assembler Mnemonic


Mnemonic  Operands                  Opcode  Inst. Type  Page

addc      src1, src2, dst           5B0     REG         11-6
addi      src1, src2, dst           591     REG         11-7
addo      src1, src2, dst           590     REG         11-7
addr      src1, src2, dst           78F     REG         11-8
addrl     src1, src2, dst           79F     REG         11-8
alterbit  bitpos, src, dst          58F     REG         11-10
and       src1, src2, dst           581     REG         11-11
andnot    src1, src2, dst           582     REG         11-11
atadd     src/dst, src, dst         612     REG         11-12
atanr     src1, src2, dst           680     REG         11-13
atanrl    src1, src2, dst           690     REG         11-13
atmod     src, mask, src/dst        610     REG         11-15
b         targ                      08      CTRL        11-18
bal       targ                      0B      CTRL        11-16
balx      targ, dst                 85      MEM         11-16
bbc       bitpos, src, targ         30      COBR        11-20
bbs       bitpos, src, targ         37      COBR        11-20
be        targ                      12      CTRL        11-22
bg        targ                      11      CTRL        11-22
bge       targ                      13      CTRL        11-22
bl        targ                      14      CTRL        11-22
ble       targ                      16      CTRL        11-22
bne       targ                      15      CTRL        11-22
bno       targ                      10      CTRL        11-22
bo        targ                      17      CTRL        11-22
bx        targ                      84      MEM         11-18
call      targ                      09      CTRL        11-25
calls     targ                      660     REG         11-27
callx     targ                      86      MEM         11-29
chkbit    bitpos, src               5AE     REG         11-31
classr    src                       68F     REG         11-32
classrl   src                       69F     REG         11-32
clrbit    bitpos, src, dst          58C     REG         11-34
cmpdeci   src1, src2, dst           5A7     REG         11-36
cmpdeco   src1, src2, dst           5A6     REG         11-36
cmpi      src1, src2                5A1     REG         11-35
cmpibe    src1, src2, targ          3A      COBR        11-42
cmpibg    src1, src2, targ          39      COBR        11-42
cmpibge   src1, src2, targ          3B      COBR        11-42
cmpibl    src1, src2, targ          3C      COBR        11-42
cmpible   src1, src2, targ          3E      COBR        11-42
cmpibne   src1, src2, targ          3D      COBR        11-42
cmpibno   src1, src2, targ          38      COBR        11-42
cmpibo    src1, src2, targ          3F      COBR        11-42
cmpinci   src1, src2, dst           5A5     REG         11-37
cmpinco   src1, src2, dst           5A4     REG         11-37
cmpo      src1, src2                5A0     REG         11-35
cmpobe    src1, src2, targ          32      COBR        11-42
cmpobg    src1, src2, targ          31      COBR        11-42
cmpobge   src1, src2, targ          33      COBR        11-42
cmpobl    src1, src2, targ          34      COBR        11-42
cmpoble   src1, src2, targ          36      COBR        11-42
cmpobne   src1, src2, targ          35      COBR        11-42
cmpor     src1, src2                684     REG         11-38
cmporl    src1, src2                694     REG         11-38
cmpr      src1, src2                685     REG         11-40
cmprl     src1, src2                695     REG         11-40
concmpi   src1, src2                5A3     REG         11-45
concmpo   src1, src2                5A2     REG         11-45
cosr      src, dst                  68D     REG         11-46
cosrl     src, dst                  69D     REG         11-46
cpyrsre   src1, src2, dst           6E3     REG         11-48
cpysre    src1, src2, dst           6E2     REG         11-48
cvtilr    src, dst                  675     REG         11-49
cvtir     src, dst                  674     REG         11-49
cvtri     src, dst                  6C0     REG         11-50
cvtril    src, dst                  6C1     REG         11-50
cvtzri    src, dst                  6C2     REG         11-50
cvtzril   src, dst                  6C3     REG         11-50
daddc     src1, src2, dst           642     REG         11-52
divi      src1, src2, dst           74B     REG         11-53
divo      src1, src2, dst           70B     REG         11-53
divr      src1, src2, dst           78B     REG         11-54
divrl     src1, src2, dst           79B     REG         11-54
dmovt     src, dst                  644     REG         11-56
dsubc     src1, src2, dst           643     REG         11-57
ediv      src1, src2, dst           671     REG         11-58
emul      src1, src2, dst           670     REG         11-59
expr      src, dst                  689     REG         11-60
exprl     src, dst                  699     REG         11-60
extract   bitpos, len, src/dst      651     REG         11-62
faulte                              1A      CTRL        11-63
faultg                              19      CTRL        11-63
faultge                             1B      CTRL        11-63
faultl                              1C      CTRL        11-63
faultle                             1E      CTRL        11-63
faultne                             1D      CTRL        11-63
faultno                             18      CTRL        11-63
faulto                              1F      CTRL        11-63
flushreg                            66D     REG         11-65
fmark                               66C     REG         11-66



ld        src, dst                  90      MEM         11-67
lda       src, dst                  8C      MEM         11-69
ldib      src, dst                  C0      MEM         11-67
ldis      src, dst                  C8      MEM         11-67
ldl       src, dst                  98      MEM         11-67
ldob      src, dst                  80      MEM         11-67
ldos      src, dst                  88      MEM         11-67
ldq       src, dst                  B0      MEM         11-67
ldt       src, dst                  A0      MEM         11-67
logbnr    src, dst                  68A     REG         11-70
logbnrl   src, dst                  69A     REG         11-70
logepr    src1, src2, dst           681     REG         11-72
logeprl   src1, src2, dst           691     REG         11-72
logr      src1, src2, dst           682     REG         11-75
logrl     src1, src2, dst           692     REG         11-75
mark                                66B     REG         11-78
modac     mask, src, dst            645     REG         11-79
modi      src1, src2, dst           749     REG         11-80
modify    mask, src, src/dst        650     REG         11-81
modpc     src, mask, src/dst        655     REG         11-82
modtc     mask, src, dst            654     REG         11-84
mov       src, dst                  5CC     REG         11-85
movl      src, dst                  5DC     REG         11-85
movq      src, dst                  5FC     REG         11-85
movr      src, dst                  6C9     REG         11-86
movre     src, dst                  6E9     REG         11-86
movrl     src, dst                  6D9     REG         11-86
movt      src, dst                  5EC     REG         11-85
muli      src1, src2, dst           741     REG         11-88
mulo      src1, src2, dst           701     REG         11-88
mulr      src1, src2, dst           78C     REG         11-89
mulrl     src1, src2, dst           79C     REG         11-89
nand      src1, src2, dst           58E     REG         11-91
nor       src1, src2, dst           588     REG         11-92
not       src, dst                  58A     REG         11-93
notand    src, dst                  584     REG         11-93
notbit    bitpos, src, dst          580     REG         11-94
notor     src1, src2, dst           58D     REG         11-95
or        src1, src2, dst           587     REG         11-96
ornot     src1, src2, dst           58B     REG         11-96
remi      src1, src2, dst           748     REG         11-97
remo      src1, src2, dst           708     REG         11-97
remr      src1, src2, dst           683     REG         11-98
remrl     src1, src2, dst           693     REG         11-98
ret                                 0A      CTRL        11-101
rotate    len, src, dst             59D     REG         11-103
roundr    src, dst                  68B     REG         11-104
roundrl   src, dst                  69B     REG         11-104
scaler    src1, src2, dst           677     REG         11-105
scalerl   src1, src2, dst           676     REG         11-105
scanbit   src, dst                  641     REG         11-107
scanbyte  src1, src2                5AC     REG         11-108
setbit    bitpos, src, dst          583     REG         11-109
shli      len, src, dst             59E     REG         11-110
shlo      len, src, dst             59C     REG         11-110
shrdi     len, src, dst             59A     REG         11-110
shri      len, src, dst             59B     REG         11-110
shro      len, src, dst             598     REG         11-110
sinr      src, dst                  68C     REG         11-112
sinrl     src, dst                  69C     REG         11-112
spanbit   src, dst                  640     REG         11-114
sqrtr     src, dst                  688     REG         11-115
sqrtrl    src, dst                  698     REG         11-115
st        src, dst                  92      MEM         11-117
stib      src, dst                  C2      MEM         11-117
stis      src, dst                  CA      MEM         11-117
stl       src, dst                  9A      MEM         11-117
stob      src, dst                  82      MEM         11-117
stos      src, dst                  8A      MEM         11-117
stq       src, dst                  B2      MEM         11-117
stt       src, dst                  A2      MEM         11-117
subc      src1, src2, dst           5B2     REG         11-119
subi      src1, src2, dst           593     REG         11-120
subo      src1, src2, dst           592     REG         11-120
subr      src1, src2, dst           78D     REG         11-121
subrl     src1, src2, dst           79D     REG         11-121
syncf                               66F     REG         11-123
synld     src, dst                  615     REG         11-124
synmov    dst, src                  600     REG         11-126
synmovl   dst, src                  601     REG         11-126
synmovq   dst, src                  602     REG         11-126
tanr      src, dst                  68E     REG         11-129
tanrl     src, dst                  69E     REG         11-129
teste     dst                       22      COBR        11-131
testg     dst                       21      COBR        11-131
testge    dst                       23      COBR        11-131
testl     dst                       24      COBR        11-131
testle    dst                       26      COBR        11-131
testne    dst                       25      COBR        11-131
testno    dst                       20      COBR        11-131
testo     dst                       27      COBR        11-131
xnor      src1, src2, dst           589     REG         11-133
xor       src1, src2, dst           586     REG         11-133
Instruction List by Opcode


Opcode  Inst. Type  Mnemonic  Operands                  Page

08      CTRL        b         targ                      11-18
09      CTRL        call      targ                      11-25
0A      CTRL        ret                                 11-101
0B      CTRL        bal       targ                      11-16
10      CTRL        bno       targ                      11-22
11      CTRL        bg        targ                      11-22
12      CTRL        be        targ                      11-22
13      CTRL        bge       targ                      11-22
14      CTRL        bl        targ                      11-22
15      CTRL        bne       targ                      11-22
16      CTRL        ble       targ                      11-22
17      CTRL        bo        targ                      11-22
18      CTRL        faultno                             11-63
19      CTRL        faultg                              11-63
1A      CTRL        faulte                              11-63
1B      CTRL        faultge                             11-63
1C      CTRL        faultl                              11-63
1D      CTRL        faultne                             11-63
1E      CTRL        faultle                             11-63
1F      CTRL        faulto                              11-63
20      COBR        testno    dst                       11-131
21      COBR        testg     dst                       11-131
22      COBR        teste     dst                       11-131
23      COBR        testge    dst                       11-131
24      COBR        testl     dst                       11-131
25      COBR        testne    dst                       11-131
26      COBR        testle    dst                       11-131
27      COBR        testo     dst                       11-131
30      COBR        bbc       bitpos, src, targ         11-20
31      COBR        cmpobg    src1, src2, targ          11-42
32      COBR        cmpobe    src1, src2, targ          11-42
33      COBR        cmpobge   src1, src2, targ          11-42
34      COBR        cmpobl    src1, src2, targ          11-42
35      COBR        cmpobne   src1, src2, targ          11-42
36      COBR        cmpoble   src1, src2, targ          11-42
37      COBR        bbs       bitpos, src, targ         11-20
38      COBR        cmpibno   src1, src2, targ          11-42
39      COBR        cmpibg    src1, src2, targ          11-42
3A      COBR        cmpibe    src1, src2, targ          11-42
3B      COBR        cmpibge   src1, src2, targ          11-42
3C      COBR        cmpibl    src1, src2, targ          11-42
3D      COBR        cmpibne   src1, src2, targ          11-42
3E      COBR        cmpible   src1, src2, targ          11-42
3F      COBR        cmpibo    src1, src2, targ          11-42
80      MEM         ldob      src, dst                  11-67
82      MEM         stob      src, dst                  11-117
84      MEM         bx        targ                      11-18
85      MEM         balx      targ, dst                 11-16
86      MEM         callx     targ                      11-29
88      MEM         ldos      src, dst                  11-67
8A      MEM         stos      src, dst                  11-117
8C      MEM         lda       src, dst                  11-69
90      MEM         ld        src, dst                  11-67
92      MEM         st        src, dst                  11-117
98      MEM         ldl       src, dst                  11-67
9A      MEM         stl       src, dst                  11-117
A0      MEM         ldt       src, dst                  11-67
A2      MEM         stt       src, dst                  11-117
B0      MEM         ldq       src, dst                  11-67
B2      MEM         stq       src, dst                  11-117
C0      MEM         ldib      src, dst                  11-67
C2      MEM         stib      src, dst                  11-117
C8      MEM         ldis      src, dst                  11-67
CA      MEM         stis      src, dst                  11-117
580     REG         notbit    bitpos, src, dst          11-94
581     REG         and       src1, src2, dst           11-11
582     REG         andnot    src1, src2, dst           11-11
583     REG         setbit    bitpos, src, dst          11-109
584     REG         notand    src, dst                  11-93
586     REG         xor       src1, src2, dst           11-133
587     REG         or        src1, src2, dst           11-96
588     REG         nor       src1, src2, dst           11-92
589     REG         xnor      src1, src2, dst           11-133
58A     REG         not       src, dst                  11-93
58B     REG         ornot     src1, src2, dst           11-96
58C     REG         clrbit    bitpos, src, dst          11-34
58D     REG         notor     src1, src2, dst           11-95
58E     REG         nand      src1, src2, dst           11-91
58F     REG         alterbit  bitpos, src, dst          11-10
590     REG         addo      src1, src2, dst           11-7
591     REG         addi      src1, src2, dst           11-7
592     REG         subo      src1, src2, dst           11-120
593     REG         subi      src1, src2, dst           11-120
598     REG         shro      len, src, dst             11-110
59A     REG         shrdi     len, src, dst             11-110
59B     REG         shri      len, src, dst             11-110
59C     REG         shlo      len, src, dst             11-110
59D     REG         rotate    len, src, dst             11-103
59E     REG         shli      len, src, dst             11-110
5A0     REG         cmpo      src1, src2                11-35
5A1     REG         cmpi      src1, src2                11-35
5A2     REG         concmpo   src1, src2                11-45
5A3     REG         concmpi   src1, src2                11-45
5A4     REG         cmpinco   src1, src2, dst           11-37
5A5     REG         cmpinci   src1, src2, dst           11-37
5A6     REG         cmpdeco   src1, src2, dst           11-36
5A7     REG         cmpdeci   src1, src2, dst           11-36
5AC     REG         scanbyte  src1, src2                11-108
5AE     REG         chkbit    bitpos, src               11-31
5B0     REG         addc      src1, src2, dst           11-6
5B2     REG         subc      src1, src2, dst           11-119
5CC     REG         mov       src, dst                  11-85
5DC     REG         movl      src, dst                  11-85
5EC     REG         movt      src, dst                  11-85
5FC     REG         movq      src, dst                  11-85
600     REG         synmov    dst, src                  11-126
601     REG         synmovl   dst, src                  11-126
602     REG         synmovq   dst, src                  11-126
610     REG         atmod     src, mask, src/dst        11-15
612     REG         atadd     src/dst, src, dst         11-12
615     REG         synld     src, dst                  11-124
640     REG         spanbit   src, dst                  11-114
641     REG         scanbit   src, dst                  11-107
642     REG         daddc     src1, src2, dst           11-52
643     REG         dsubc     src1, src2, dst           11-57
644     REG         dmovt     src, dst                  11-56
645     REG         modac     mask, src, dst            11-79
650     REG         modify    mask, src, src/dst        11-81
651     REG         extract   bitpos, len, src/dst      11-62
654     REG         modtc     mask, src, dst            11-84
655     REG         modpc     src, mask, src/dst        11-82
660     REG         calls     targ                      11-27
66B     REG         mark                                11-78
66C     REG         fmark                               11-66
66D     REG         flushreg                            11-65
66F     REG         syncf                               11-123
670     REG         emul      src1, src2, dst           11-59
671     REG         ediv      src1, src2, dst           11-58
674     REG         cvtir     src, dst                  11-49
675     REG         cvtilr    src, dst                  11-49
676     REG         scalerl   src1, src2, dst           11-105
677     REG         scaler    src1, src2, dst           11-105
680     REG         atanr     src1, src2, dst           11-13
681     REG         logepr    src1, src2, dst           11-72
682     REG         logr      src1, src2, dst           11-75
683     REG         remr      src1, src2, dst           11-98
684     REG         cmpor     src1, src2                11-38
685     REG         cmpr      src1, src2                11-40
688     REG         sqrtr     src, dst                  11-115
689     REG         expr      src, dst                  11-60
68A     REG         logbnr    src, dst                  11-70
68B     REG         roundr    src, dst                  11-104
68C     REG         sinr      src, dst                  11-112
68D     REG         cosr      src, dst                  11-46
68E     REG         tanr      src, dst                  11-129
68F     REG         classr    src                       11-32
690     REG         atanrl    src1, src2, dst           11-13
691     REG         logeprl   src1, src2, dst           11-72
692     REG         logrl     src1, src2, dst           11-75
693     REG         remrl     src1, src2, dst           11-98
694     REG         cmporl    src1, src2                11-38
695     REG         cmprl     src1, src2                11-40
698     REG         sqrtrl    src, dst                  11-115
699     REG         exprl     src, dst                  11-60
69A     REG         logbnrl   src, dst                  11-70
69B     REG         roundrl   src, dst                  11-104
69C     REG         sinrl     src, dst                  11-112
69D     REG         cosrl     src, dst                  11-46
69E     REG         tanrl     src, dst                  11-129
69F     REG         classrl   src                       11-32
6C0     REG         cvtri     src, dst                  11-50
6C1     REG         cvtril    src, dst                  11-50
6C2     REG         cvtzri    src, dst                  11-50
6C3     REG         cvtzril   src, dst                  11-50
6C9     REG         movr      src, dst                  11-86
6D9     REG         movrl     src, dst                  11-86
6E2     REG         cpysre    src1, src2, dst           11-48
6E3     REG         cpyrsre   src1, src2, dst           11-48
6E9     REG         movre     src, dst                  11-86
701     REG         mulo      src1, src2, dst           11-88
708     REG         remo      src1, src2, dst           11-97
70B     REG         divo      src1, src2, dst           11-53
741     REG         muli      src1, src2, dst           11-88
748     REG         remi      src1, src2, dst           11-97
749     REG         modi      src1, src2, dst           11-80
74B     REG         divi      src1, src2, dst           11-53
78B     REG         divr      src1, src2, dst           11-54
78C     REG         mulr      src1, src2, dst           11-89
78D     REG         subr      src1, src2, dst           11-121
78F     REG         addr      src1, src2, dst           11-8
79B     REG         divrl     src1, src2, dst           11-54
79C     REG         mulrl     src1, src2, dst           11-89
79D     REG         subrl     src1, src2, dst           11-121
79F     REG         addrl     src1, src2, dst           11-8
The following pages provide a collection of the system data structures presented in this manual.
They are grouped by function. The chapter reference below each data structure shows where in
this manual the data structure is described.


[Figure: arithmetic controls. Field labels: CONDITION CODE; ARITHMETIC STATUS; INTEGER
OVERFLOW FLAG; INTEGER OVERFLOW MASK; NO IMPRECISE FAULTS; FLOATING OVERFLOW FLAG;
FLOATING UNDERFLOW FLAG; FLOATING INVALID-OP FLAG; FLOATING ZERO-DIVIDE FLAG; FLOATING
INEXACT FLAG; FLOATING OVERFLOW MASK; FLOATING UNDERFLOW MASK; FLOATING INVALID-OP MASK;
FLOATING ZERO-DIVIDE MASK; FLOATING INEXACT MASK; FLOATING-POINT NORMALIZING MODE;
FLOATING-POINT ROUNDING CONTROL.]


[Figure: registers available to a procedure. GLOBAL REGISTERS: registers g0 through g14 available
for general use. FLOATING-POINT REGISTERS: the last register shown is fp3. The contents of the
global and floating-point registers are preserved across procedure boundaries. LOCAL REGISTERS:
PREVIOUS FRAME POINTER (PFP), STACK POINTER (SP), RETURN INSTRUCTION POINTER (RIP), and
registers r4 through r15 available for general use. A new set of local registers is allocated for
each procedure.]


[Figure: procedure stack. The PREVIOUS FRAME and the CURRENT FRAME each begin with the
PREVIOUS FRAME POINTER (PFP) in r0, the STACK POINTER (SP) in r1, and the RETURN INSTRUCTION
POINTER (RIP) in r2. The stack grows from low addresses to high addresses. The current frame
pointer (FP), stored in g15, points to the first word of the current frame.]


[Figure: process controls. Field labels: TRACE ENABLE; EXECUTION MODE; RESUME; TRACE-FAULT
PENDING; STATE; PRIORITY; INTERNAL STATE. Legend: RESERVED (INITIALIZE TO 0).]


[Figure: initial memory image. CHECK-SUM WORDS, by physical address: SAT POINTER at 0; PRCB
POINTER at 4; CHECK WORD at 8; INSTRUCTION POINTER at 12; 4 CHECK WORDS at 16 through 28.
SYSTEM ADDRESS TABLE (SAT), by offset: entries at 136 through 156, including the SYSTEM
PROCEDURE POINTER. PROCESSOR CONTROL BLOCK (PRCB), by offset: words at 0 through 172,
including 0000 027F₁₆ at 32 and at 36, the FAULT TABLE POINTER at 40, 0000 0000₁₆ at 44, and
SCRATCH SPACE from 80 through 172. Legend: RESERVED (INITIALIZE TO 0); PRESERVED.]
Interrupt Handling


[Figure: interrupt table, by byte offset: PENDING PRIORITIES at 0; PENDING INTERRUPTS at 4
through 32; ENTRY 8 (VECTOR 8) at 36; ENTRY 9 (VECTOR 9) at 40; ENTRY 10 (VECTOR 10) at 44; and so
on through ENTRY 255 (VECTOR 255) at 1024. The figure also marks the entries at 976 (VECTOR 243),
980 (VECTOR 244), 992 (VECTOR 247), 996 (VECTOR 248), 1000 (VECTOR 249), 1008 (VECTOR 251), and
1012 (VECTOR 252).]


[Figure: PROCEDURE ENTRY FORMAT. Bits 31-2: INSTRUCTION POINTER; bits 1-0. Legend: RESERVED
(INITIALIZE TO 0).]


[Figure: interrupt stack. RESUMPTION RECORD FOR SUSPENDED INSTRUCTION (OPTIONAL), followed by
the INTERRUPT RECORD: SAVED PROCESS CONTROLS at NFP-16; SAVED ARITHMETIC CONTROLS at NFP-12;
VECTOR NUMBER. Legend: RESERVED.]

*If the interrupt is serviced while the processor is working on another interrupt procedure,
the new stack pointer (NSP) will be the same as the SP.


[Figure: IAC message format, four words. First word: MESSAGE TYPE, FIELD 1, FIELD 2. Second
word: FIELD 3. Third word: FIELD 4. Fourth word: FIELD 5.]


[Figure: fault record, by byte offset (words at 0 through 44): PROCESS CONTROLS at 32;
ARITHMETIC CONTROLS at 36; FAULT FLAGS, FAULT TYPE, and FAULT SUBTYPE at 40; ADDRESS OF FAULTING
INSTRUCTION at 44.]
[Figure: fault table, by byte offset: first entry at 0; TRACE FAULT ENTRY at 8; OPERATION FAULT
ENTRY at 16; ARITHMETIC FAULT ENTRY at 24; FLOATING-POINT FAULT ENTRY at 32; CONSTRAINT FAULT
ENTRY at 40; entry at 48; PROTECTION FAULT ENTRY at 56; MACHINE FAULT ENTRY at 64; entry at 72;
TYPE FAULT ENTRY at 80; further entries at 88 through 120.]


[Figure: fault table entry formats, each occupying the words at n and n+4. One format holds a
FAULT-HANDLER PROCEDURE NUMBER together with the value 0000027F₁₆. Legend: RESERVED (INITIALIZE
TO 0).]


[Figure: trace controls. Mode flags: INSTRUCTION TRACE MODE; BRANCH TRACE MODE; CALL TRACE
MODE; RETURN TRACE MODE; PRERETURN TRACE MODE; SUPERVISOR TRACE MODE; BREAKPOINT TRACE MODE.
Event flags: INSTRUCTION TRACE EVENT; BRANCH TRACE EVENT; CALL TRACE EVENT; RETURN TRACE EVENT;
PRERETURN TRACE EVENT; SUPERVISOR TRACE EVENT; BREAKPOINT TRACE EVENT. Legend: RESERVED
(INITIALIZE TO 0).]

APPENDIX B
MACHINE-LEVEL INSTRUCTION FORMATS


This appendix describes the machine-level format for 80960KB instructions. Included is a
description of the four instruction formats and how the addressing modes relate to these
formats. Also, a table is given that shows the relationship between the machine-level instruction
operands and the assembly-language-level instruction operands.


At the machine level, all the 80960KB instructions are one word long and begin on word
boundaries. (One group of instructions allows a second word, which contains a 32-bit
displacement.)


There are four basic instruction formats: REG, COBR, CTRL, and MEM. Figure B-1 shows
these formats. Each instruction has only one format, which is defined by the opcode field of
the instruction.


REG     bits 31-24   OPCODE
        bits 23-19   SRC/DST
        bits 18-14   SRC2
        bits 13-11   mode bits (M1, M2, M3)
        bits 10-7    OPCODE
        bits 6-5     0 0
        bits 4-0     SRC1

COBR    bits 31-24   OPCODE
        bits 23-19   SRC1
        bits 18-14   SRC2
        bit 13       M1
        bits 12-2    DISPLACEMENT
        bits 1-0     0 0

CTRL    bits 31-24   OPCODE
        bits 23-2    DISPLACEMENT
        bits 1-0     0 0

MEMA    bits 31-24   OPCODE
        bits 23-19   SRC/DST
        bits 18-14   ABASE
        bit 13       MODE
        bit 12       0
        bits 11-0    OFFSET

MEMB    bits 31-24   OPCODE
        bits 23-19   SRC/DST
        bits 18-14   ABASE
        bits 13-10   MODE
        bits 9-7     SCALE
        bits 6-5     0 0
        bits 4-0     INDEX


Figure B-1: Instruction Formats


The following sections describe the fields in the instruction word for each format. 


The REG format is for operations that are performed on data contained in the global, local, and
floating-point registers. The majority of the 80960KB instructions use this format.


The opcode for the REG instructions is 12 bits long (3 hexadecimal digits) and is split between
bits 7 through 10 and bits 24 through 31. For example, the opcode for the addi instruction is
591₁₆. Here, 59₁₆ is contained in bits 24 through 31 and 1₁₆ is contained in bits 7 through 10.
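The opcode split can be expressed as two small helper functions. The function names are
hypothetical; the bit positions come from the paragraph above, and the operand and mode fields of
the instruction word are left as zeros.

```python
def place_reg_opcode(opcode: int) -> int:
    """Place a 12-bit REG opcode into an otherwise empty instruction word."""
    if not 0 <= opcode <= 0xFFF:
        raise ValueError("REG opcodes are 12 bits long")
    high = opcode >> 4        # two high-order hex digits go to bits 24 through 31
    low = opcode & 0xF        # low-order hex digit goes to bits 7 through 10
    return (high << 24) | (low << 7)


def extract_reg_opcode(word: int) -> int:
    """Recover the 12-bit opcode from a REG-format instruction word."""
    return (((word >> 24) & 0xFF) << 4) | ((word >> 7) & 0xF)


word = place_reg_opcode(0x591)   # addi
print(hex(word), hex(extract_reg_opcode(word)))
```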


The src1 and src2 fields specify source operands for the instruction. The operands can be
either registers or literals. The mode bits (M1 for src1 and M2 for src2) and the instruction type
(non-floating point or floating point) determine whether an operand is a register or a literal.
Table B-1 shows the relationship between the instruction type, the mode bits, and the src1 and
src2 operands.


Table B-1: Encoding of Src1 and Src2 Fields in REG Format


Inst. Type  M1 or M2  Src1 or Src2      Register     Literal
                      Operand Value     Number       Value

Non-FP      0         00000 to 01111    r0 to r15
                      10000 to 11111    g0 to g15
            1         00000 to 11111                 0 to 31
FP          0         00000 to 01111    r0 to r15
                      10000 to 11111    g0 to g15
            1         00000 to 00011    fp0 to fp3
                      00100 to 01111    reserved
                      10000                          +0.0
                      10001 to 10101    reserved
                      10110                          +1.0
                      10111 to 11111    reserved


For non-floating-point instructions, if a mode bit is set to 0, the respective src1 or src2 field
specifies a global or local register. If the mode bit is set to 1, the field specifies an ordinal
literal in the range of 0 to 31.


For floating-point instructions, if the mode bit is set to 0, the respective src1 or src2 field
specifies a global or local register (just as it does for non-floating-point instructions). If the
mode bit is set to 1, the field specifies either a floating-point register or one of two real-number
literals (+0.0 or +1.0). All of the other encodings when the mode bit is set to 1 are reserved.
When a reserved encoding is used as a source, the processor either signals an invalid opcode
fault or produces an undefined value.
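The source-operand rules can be written out as a lookup function. This is a sketch of Table B-1
only: the function name and the returned strings are hypothetical, and reserved encodings are
simply reported as "reserved" rather than modeled as a fault.

```python
def decode_src(field: int, mode: int, floating_point: bool) -> str:
    """Interpret a 5-bit src1 or src2 field of a REG instruction per Table B-1."""
    if not 0 <= field <= 0b11111 or mode not in (0, 1):
        raise ValueError("field is 5 bits and mode is a single bit")
    if mode == 0:
        # 00000-01111 select local registers, 10000-11111 select global registers.
        return f"r{field}" if field < 16 else f"g{field - 16}"
    if not floating_point:
        return str(field)                 # ordinal literal 0 through 31
    if field <= 0b00011:
        return f"fp{field}"               # floating-point registers fp0 through fp3
    if field == 0b10000:
        return "+0.0"
    if field == 0b10110:
        return "+1.0"
    return "reserved"


print(decode_src(0b10110, mode=1, floating_point=True))
```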


The src/dst field can specify either a source operand or a destination operand or both, depending
on the instruction. Here again, the mode bit (M3) and the instruction type (non-floating
point or floating point) determine how this field is used. Table B-2 shows this relationship.


Table B-2: Encoding of Src/Dst Field in REG Format


Inst. Type  M3  Src/Dst      Src Only     Dst Only

Non-FP      0   g0 .. g15    g0 .. g15    g0 .. g15
                r0 .. r15    r0 .. r15    r0 .. r15
            1   NA           Literal      NA
FP          0   NA           NA           g0 .. g15
                                          r0 .. r15
            1   NA           NA           fp0 .. fp3


For non-floating-point instructions, if M3 is clear, the src/dst operand is a global or local
register that is encoded as shown in Table B-1. If M3 is set, the src/dst operand can be used
only as a src operand that is an ordinal literal.


For floating-point instructions, the src/dst field is only used to encode destination operands.
Here, the encoding is the same as shown in Table B-1, except that the encodings for
floating-point literals are not allowed. That is, if M3 is clear, the destination operand is a
global or local register; if M3 is set, the destination operand is a floating-point register.
When a reserved encoding or literal encoding is used as a destination, the processor either
signals an invalid opcode fault or produces an undefined result.
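For floating-point instructions, the destination rules reduce to a short check. The function name
and the "invalid" marker are hypothetical; the sketch treats every encoding other than fp0 through
fp3 as invalid when M3 is set, which covers both the reserved and the literal encodings.

```python
def decode_fp_dst(field: int, m3: int) -> str:
    """Interpret the 5-bit src/dst field of a floating-point REG instruction."""
    if not 0 <= field <= 0b11111 or m3 not in (0, 1):
        raise ValueError("field is 5 bits and M3 is a single bit")
    if m3 == 0:
        return f"r{field}" if field < 16 else f"g{field - 16}"
    if field <= 0b00011:
        return f"fp{field}"
    return "invalid"   # reserved or literal encoding used as a destination


print(decode_fp_dst(0b00010, m3=1), decode_fp_dst(0b10000, m3=1))
```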


The COBR format is used primarily for control-and-branch instructions. (The test-if
instructions also use this format.) The opcode field for this format is 8 bits (two hexadecimal
digits).


The src1 and src2 fields specify source operands for the instruction. The src1 field can specify
either a global or local register or a literal as determined by mode bit M1. (The encoding of the
src1 field is the same as is shown in Table B-1 for the non-floating-point instructions.) The
src2 field can only specify a local or global register.
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The displacement 
field contains 
a signed, 
twos complement 
number 
that specifies 
a word 
displacement. 
The processor uses this value to compute the address of a target instruction 
that 
the processor goes to as the result of a comparison. 
The displacement 
field can range from _210 


to (210 -1). 
To determine 
the IP of the target instruction, 
the processor 
converts the displace- 
ment value to a byte displacement 
(i.e., multiplies 
the value by 4). 
It then adds the resulting 
byte displacement 
to the IP of the next instruction. 
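

The target computation described above can be sketched in Python. This is an illustrative
model only, not Intel code: the function name is hypothetical, the 11-bit field width is
inferred from the stated range of -2^10 to (2^10 - 1), and wrapping the result to 32 bits is
an assumption.

```python
def cobr_target_ip(next_ip: int, disp_field: int) -> int:
    """Target IP of a COBR-format branch, following the text's description.

    next_ip    -- IP of the instruction after the branch (per the text)
    disp_field -- raw displacement field, assumed here to be 11 bits wide
    """
    if not 0 <= disp_field < (1 << 11):
        raise ValueError("displacement field must fit in 11 bits")
    # Sign-extend the two's complement field to a signed word count.
    words = disp_field - (1 << 11) if disp_field & (1 << 10) else disp_field
    # Words -> bytes (multiply by 4), add to the IP, wrap to 32 bits (assumed).
    return (next_ip + 4 * words) & 0xFFFFFFFF
```

For example, a field value of 1 moves the target one word past the next instruction, and the
all-ones field value 0x7FF moves it one word back.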


Note 


To allow labels or absolute addresses to be used in the assembly-language version of the COBR
format instructions, the Intel 80960KB Assembler converts a targ (target) operand value in an
assembly-language instruction into the displacement value required for the COBR format, using
the following calculation:


For the test-if instructions, only the src1 field is used. Here, this field specifies a
destination global or local register (m1 is ignored).


The CTRL format is used for instructions that branch to a new IP, including the branch,
branch-if, bal, and call instructions. The return instruction also uses this format. The opcode
field for this format is 8 bits (two hexadecimal digits).


The instructions that use this format have no operands. The target address for a branch is
specified with the displacement field in the same manner as is done with the COBR format
instructions. Here, the displacement field specifies a word displacement (also a signed, two's
complement number) that can range from -2^21 to (2^21 - 1).


The MEM format is used for instructions that require a memory address to be computed. These
instructions include the load, store, and lda instructions. Also, the extended versions of the
branch, branch-and-link, and call instructions (bx, balx, and callx) use this format.


There are two MEM formats, MEMA and MEMB. The MEMB format offers the option of including a
32-bit displacement (contained in a second word) with the instruction. Bit 12 of the first word
of the instruction determines whether the format is MEMA (clear) or MEMB (set).


For both formats the opcode field is 8 bits long. The src/dst field specifies a global or local
register. For load instructions, the src/dst field specifies the destination register for a word
loaded into the processor from memory or, for operands larger than one word, the first of
successive destination registers. For store instructions, this field specifies the register or
group of registers that contain the source operand to be stored in memory.


The mode bit (or bits, for the MEMB format) determines the addressing mode used for the
instruction. Table B-3 summarizes the addressing modes for the two versions of the MEM format.
The fields used in these addressing modes are described in the following sections.


Table B-3: Addressing Modes for MEM Format Instructions


Format   Mode Bit(s)   Address Computation
MEMA     0             offset
         1             (abase) + offset
MEMB     0100          (abase)
         0101          (IP) + displacement + 8
         0110          reserved
         0111          (abase) + (index) * 2^scale
         1100          displacement
         1101          (abase) + displacement
         1110          (index) * 2^scale + displacement
         1111          (abase) + (index) * 2^scale + displacement


Notes:
1. In the address computations above, a field in parentheses (e.g., (abase)) indicates that the
   value in the specified register is used in the computation.
2. The use of a reserved encoding causes an invalid opcode fault to be signaled.
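

The address computations in Table B-3 can be written out as a small Python model. This is a
sketch for checking the arithmetic, not a description of the hardware: the function and
exception names are hypothetical, register operands are passed in as plain integers, the scale
multipliers are those listed in Table B-4, and wrapping the result to 32 bits is an assumption.

```python
MASK32 = 0xFFFFFFFF
# Scale-field encodings and multipliers as listed in Table B-4.
SCALE_FACTOR = {0b000: 1, 0b001: 2, 0b010: 4, 0b011: 8, 0b100: 16}


class InvalidOpcodeFault(Exception):
    """Stands in for the fault signaled on a reserved encoding."""


def mem_effective_address(fmt, mode, *, offset=0, abase=0, index=0,
                          scale=0, displacement=0, ip=0):
    """Effective address for one row of Table B-3 (illustrative only)."""
    if fmt == "MEMA":
        table = {0: offset, 1: abase + offset}
    elif fmt == "MEMB":
        if mode == 0b0110:
            raise InvalidOpcodeFault("reserved addressing mode")
        scaled = 0
        if mode in (0b0111, 0b1110, 0b1111):
            if scale not in SCALE_FACTOR:
                raise InvalidOpcodeFault("reserved scale encoding")
            scaled = index * SCALE_FACTOR[scale]
        table = {
            0b0100: abase,
            0b0101: ip + displacement + 8,
            0b0111: abase + scaled,
            0b1100: displacement,
            0b1101: abase + displacement,
            0b1110: scaled + displacement,
            0b1111: abase + scaled + displacement,
        }
    else:
        raise ValueError("fmt must be 'MEMA' or 'MEMB'")
    if mode not in table:
        raise ValueError("mode is not listed in Table B-3")
    return table[mode] & MASK32  # 32-bit wraparound is an assumption
```

Keeping one dictionary entry per table row makes it easy to compare the model against the table
line by line.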


The MEMA format provides two addressing modes:


•  absolute offset
•  register indirect with offset


The offset field specifies an unsigned byte offset from 0 to 4095. The abase field specifies a
global or local register that contains an address in memory. The address is interpreted as
either a virtual address or a physical address, depending on whether the processor is operating
in virtual-addressing or physical-addressing mode, respectively.


For the absolute offset addressing mode (the mode bit is clear), the processor interprets the
offset field as an offset from byte 0 of the current process address space. The abase field is
ignored. Using this addressing mode along with the lda instruction allows a constant from 0 to
4095 to be loaded into a register.


For the register indirect with offset addressing mode (the mode bit is set), the value in the
offset field is added to the address in the abase register. Setting the offset value to zero
creates a register indirect addressing mode; however, this operation can generally be carried
out faster by using the MEMB version of this addressing mode.


The MEMB format provides the following seven addressing modes:


•  absolute displacement
•  register indirect
•  register indirect with displacement
•  register indirect with index
•  register indirect with index and displacement
•  index with displacement
•  IP with displacement


The abase and index fields specify local or global registers, the contents of which are used in
the address computation. When the index field is used in an addressing mode, the processor
automatically scales the value in the index register by the amount specified in the scale field.


Table B-4 gives the encoding of the scale field. The optional displacement field is contained
in the word following the instruction word. The displacement is a 32-bit, signed, two's
complement value.


Table B-4: Encoding of Scale Field


Scale         Scale Factor (Multiplier)
000           1
001           2
010           4
011           8
100           16
101 to 111    reserved


Note: The use of a reserved encoding causes an invalid opcode fault to be signaled.


For the IP with displacement mode, the value of the displacement field plus 8 is added to the
address of the current instruction.


INSTRUCTION TIMING


This appendix describes the 80960KB processor's instruction pipeline and how it affects the
timing of instructions. The number of clock cycles required for each instruction is also given
here.


The 80960 architecture defines several mechanisms for increasing processor performance through
the use of pipelining and parallel execution of instructions. This appendix describes how these
mechanisms have been incorporated into the design of the 80960KB processor and provides
information to help programmers maximize the performance of the processor.


The 80960KB processor is composed of the following six major functional units (shown in
Figure C-1):


•  Bus Control Logic
•  Instruction Fetch Unit and Instruction Cache
•  Instruction Decoder
•  Micro-Instruction Sequencer and ROM
•  Instruction Execution Unit
•  Floating Point Unit


[Block diagram. Labeled blocks: floating-point registers and floating-point unit (extension to
the 80960 architecture); global registers and local register sets; instruction execution unit;
micro-instruction sequencer and ROM; instruction decoder; instruction fetch unit and
instruction cache; bus control logic; external bus.]


Figure C-1: Block Diagram of the 80960KB Processor


These units function independently of one another, but in close cooperation. The functions of
each of these units are described in the following sections.


The Bus Control Logic (BCL) provides the interface between the processor and the external
world. This interface consists of a multiplexed, burst bus, which is capable of memory-access
rates of over 53 Megabytes/second (with a 20 MHz CPU clock). The BCL accepts requests from other
units within the 80960KB, prioritizes them, and executes them. It attempts to maximize bus
access efficiency through buffering and burst accesses.


The BCL provides a queuing mechanism that can buffer up to three outstanding requests at any
given time. This mechanism, coupled with other 80960KB features (such as scoreboarding, which
is discussed later), allows other units in the 80960KB to continue operation without waiting
for bus requests to be completed. As a result, the execution of most memory reference
instructions requires little or no delay in the instruction execution pipeline.


The BCL generates burst cycles on the external bus, which allow from one to 16 bytes of data to
be read or written in a single operation. The processor takes advantage of burst transfers in
several ways. First, multiple-register load or store operations can be carried out in a single
bus operation, using the ldl (load long), ldt (load triple), and ldq (load quad) instructions
and the corresponding stl (store long), stt (store triple), and stq (store quad) instructions.
Second, instructions can be fetched in 16-byte bursts, thereby reducing bus traffic for
instruction fetches. Third, floating-point values of 32, 64, or 80 bits can be stored in a
single bus operation.


The Instruction Fetch Unit (IFU) acts as an intelligent "buffer" for the Instruction Decoder
(ID). Its purpose is to present the instruction stream to the ID in the fastest and most
transparent way possible. The IFU uses several mechanisms to accomplish this goal, as described
in the following paragraphs.


The IFU maintains a 512-byte, direct-mapped instruction cache. This cache allows very fast
access to instructions. While the other units in the processor are executing instructions, the
IFU looks ahead in the flow of instructions stored in the instruction cache. If a cache miss is
detected (that is, an instruction that will soon be needed is not in the instruction cache),
the IFU issues a prefetch request to the BCL. Upon receiving the requested instruction, the IFU
updates the instruction cache. In most cases, this fetch and load will take place before the ID
requires the instruction. The major exception to this rule happens on branch conditions.


The IFU works closely with the ID in handling branch conditions. The ID informs the IFU of any
branch operations that are about to take place. Such notifications take place on unconditional
branches and on conditional branches in which the condition code is valid. When the IFU is
notified of a branch, it checks for a cache hit on the desired instruction. If the instruction
is not present, the IFU begins fetching instructions for the new control path.


To further minimize delays in the instruction pipeline, the ID sends a special signal to the
IFU whenever instructions are required immediately. The IFU then passes the fetched
instructions to the ID directly, rather than writing them to the cache and reading them back
out again. This technique is called instruction-cache bypassing.


The instruction pointer (IP) register in the processor and the IFU maintain several instruction
pointers. These pointers point to instructions at various stages of the fetch-decode-execute
pipeline. If a fault is signaled from any unit, the processor uses these pointers to determine
the problem and preserve the state of the processor.


The ID decodes the instructions it receives from the IFU and routes them to the appropriate
execution units. In doing this, it attempts to keep the computing resources of the processor
working at the highest possible levels.


Instructions are decoded into the following four groups, according to how the instructions are
executed:


•  Simple Instructions
•  Floating Point and Branch Instructions
•  Complex Instructions
•  Load and Store Instructions


The following paragraphs list the instructions in each of these groups and describe how the ID
handles them.


The instructions in the simple-instruction group require very little decoding. These
instructions include logical; comparison; shift; integer add and subtract; and ordinal add and
subtract instructions. The ID decodes these instructions and passes them to the instruction
execution unit (IEU), where they are executed, usually in a single clock period.


All floating-point instructions are executed by the floating-point unit (FPU). Often, the
execution of floating-point instructions requires interaction between the FPU, ID, and
Micro-Instruction Sequencer (MIS). For example, the FPU may require access to the
general-purpose registers (maintained by the IEU). Here, the ID assists in supplying data to the
FPU. Also, many of the floating-point instructions are executed by means of microcode. The FPU
gets the microcode from the MIS.


The ID executes branch instructions directly. If the branches are unconditional, no interaction
with the processor's other execution units is required.


On conditional branch instructions, the ID uses a condition code scoreboard to streamline the
branching process. Scoreboarding is a mechanism by which various resources within the processor
can be marked as in use (or pending a result). When one of the execution units in the processor
is in the process of altering the condition code, it marks the condition code scoreboard. When
the ID prepares to execute a conditional branch instruction, it checks the condition code
scoreboard. If the scoreboard is marked as in use, the ID waits for the result before
proceeding. If the condition code scoreboard is clear, the ID signals the IFU immediately if a
change in program flow is about to happen.


Conditional fault instructions (fault-if instructions) are also executed in the ID. These
operations differ from conditional branches in that they result in a fault event being
generated, followed by an implicit call to the appropriate fault-handler routine.


As a result of the pipelining described above, branches can often be carried out in zero clock
cycles. For example, the branch instruction (b) shown below will execute in zero cycles, since
the branch time is overlapped completely by the execution time of the floating-point
instruction (sinr).


        sinr    g0, g1
        b       some_location

    some_location:
        mov     g1, g2


A conditional branch can be overlapped in the same way, as in the following sequence:


        cmp     0x10, r9
        divi    r10, r11, r12
        be      go_here

    go_here:
        mov     g1, g2


Here, the comparison instruction (cmp) is placed early in the instruction stream, allowing the
branch condition based on the value of r9 to be resolved while the integer divide instruction
(divi) is being executed.


Complex instructions are those that are executed using one or more microcode instructions.
Examples of such instructions are the flushreg (flush local registers), mark, and fmark (force
mark) instructions. The ID decodes complex instructions and forwards them to the MIS unit.
The MIS then sends the equivalent microcode to the IEU.


Load and store instructions are those that request data to be read from or written into memory.
The ID sends these instructions directly to the BCL, which executes them.


The ID is responsible for converting the addressing information encoded in load, store, branch,
and call instructions into effective memory addresses. The circuitry that actually performs
effective-address calculations resides in the IFU, but the ID oversees these operations. The
generation of effective addresses is performed within a separate carry look-ahead adder, used
with hardware shift logic. The ability to calculate effective addresses independently from
instruction execution allows address calculation to be overlapped with computation. The time
required to calculate an effective address ranges from zero to four cycles; for the most
commonly used addressing modes, however, this time is less than two cycles.


Instructions that require effective addresses are executed by either the ID or the BCL, thus
preserving the pipeline and eliminating delays or resource constraints on the IEU or FPU.


The MIS is a multipurpose unit designed to help in the execution of instructions that use
microcode. All of the processor's microcode is stored in ROM, which is accessed through the
MIS. When the ID receives a complex instruction (one that requires microcode to be executed),
the MIS supplies the microcode to the IEU as described earlier in the discussion of complex
instructions.


The MIS also supplies microcode for floating-point instructions; the power-up and self-test
performed during processor initialization; interrupt handling; and fault handling.


The IEU contains the Arithmetic Logic Unit (ALU) and the mechanism for register and
condition-code scoreboarding. It also manages the 16 global registers and the 4 sets of 16 local
registers. The IEU performs the following operations:


•  Addition and subtraction of integers and ordinals
•  Moves between registers
•  Logical operations
•  Bit operations
•  Shifts and rotates
•  Comparisons


The IEU can also work with integer literals in the range of -16 to +31, which are encoded in
the REG instruction format. This method of encoding literals performs two functions. First, it
provides a more compact instruction stream. Second, when a literal is used as an argument for
an instruction, the IEU is able to execute the instruction in one less clock cycle.


The IEU handles the reading and writing of global and local registers. It also handles the
allocation of local register sets on procedure calls. The IEU allocates a new set of local
registers on each procedure call. If all four register sets become allocated, the IEU
automatically flushes the oldest frame to the stack on the next procedure call. The IEU also
automatically retrieves any local register frame from the stack when required by a return
operation. The majority of procedure calls or returns do not require the processor to flush
local registers to memory. Call instructions that can be executed without flushing a register
set require only 9 cycles to complete, with the corresponding return taking only 7 cycles.


The register scoreboard provides scoreboarding for the global and local registers. When one or
more registers are being used in an operation, they are marked as in use. The register
scoreboarding mechanism allows the processor to continue executing subsequent instructions, as
long as those instructions do not require the contents of the scoreboarded registers.


A typical event that would cause scoreboarding is a load operation. For a load from memory, the
contents of the affected registers are not valid until the BCL fetches the data and the
registers are loaded. For example, consider the sequence:


        ld      g0, (g1)
        addi    g2, g3, g4
        addi    g5, g4, g6
        subi    g0, g6, g6

Here, when the BCL initiates the ld operation, register g0 is scoreboarded. As long as
subsequent instructions do not require the contents of g0, the ID continues to dispatch
instructions. For example, the two addi instructions above are executed while the BCL is
fetching the data for g0. If g0 is not loaded by the time the subi instruction is ready to be
executed, the IEU delays execution of the instruction until the loading of g0 has been
completed.
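

The stall behavior described above can be illustrated with a toy Python model. It is a sketch
under loudly stated assumptions, not a cycle-accurate simulator: the function name is
hypothetical, every instruction is assumed to issue in one clock, and the load latency of 4
clocks is a placeholder rather than a figure from this appendix.

```python
def issue_cycles(program, load_latency=4):
    """Return the clock at which each instruction issues.

    program      -- list of (opcode, source_registers, destination_register)
    load_latency -- assumed clocks until a loaded register becomes valid
    """
    busy_until = {}  # register -> first clock at which its value is valid
    cycle = 0
    issued = []
    for op, sources, dest in program:
        # Stall while any register this instruction touches is scoreboarded.
        ready = max(busy_until.get(reg, 0) for reg in (*sources, dest))
        cycle = max(cycle, ready)
        issued.append(cycle)
        if op == "ld":
            # The destination stays scoreboarded until the data arrives.
            busy_until[dest] = cycle + load_latency
        cycle += 1  # assumed: one clock per instruction otherwise
    return issued


# The sequence from the text: only subi needs g0, so only subi can stall.
SEQUENCE = [
    ("ld",   ("g1",),      "g0"),
    ("addi", ("g2", "g3"), "g4"),
    ("addi", ("g5", "g4"), "g6"),
    ("subi", ("g0", "g6"), "g6"),
]
```

With the assumed latency, the two addi instructions issue back to back and subi waits one extra
clock; with a shorter latency the load is hidden completely.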


If an operation accesses a single register, only that register is scoreboarded. However, if
multiple registers are accessed (such as with the ldl, ldt, or ldq instructions), registers are
scoreboarded as shown in Table C-1, according to the base register of the group being accessed.


Base Register Accessed    Block of Registers Scoreboarded
g0                        0-3
g2                        0-3
g4                        0-7
g6                        0-7
g8                        8-11
g10                       8-11
g12                       12-15
g14                       12-14

The execution times of instructions in the IEU are dependent on the instruction flow. Two
features in the IEU that can enhance the performance of instruction execution are:


•  Register Bypassing
•  Condition Code Scoreboarding


Register Bypassing. Register bypassing is a mechanism that allows an instruction that would
ordinarily require source operands to be placed in registers to be executed without accessing
one or both of the source registers. Register bypassing occurs in either of two circumstances.
First, when the IEU executes an instruction with two source operands, register bypassing occurs
if one or both of the operands are literals. Second, register bypassing will also occur when
the second of two source operands is the result of the previous instruction. The net result of
register bypassing is the saving of one clock cycle. Most instructions that the IEU executes can
be executed in a single cycle when register bypassing occurs.


Condition Code Scoreboarding. The processor requires one clock cycle to set the condition code
bits as the result of an instruction. If one of the instructions that follows depends on the
condition code, condition-code scoreboarding can be used to save one cycle of execution time.
The following example illustrates this technique:


    Case 1 - 5 cycles

        addc    r4, r5, r10
        mov     g10, g12
        addc    r6, r7, r11


    Case 2 - 6 cycles

        addc    r4, r5, r10
        addc    r6, r7, r11
        mov     g10, g12


Here, both Case 1 and Case 2 accomplish the same task. However, Case 2 requires a wait of one
clock cycle between the first and second addc instruction, while the condition code is set.
Case 1, on the other hand, takes advantage of condition code scoreboarding by executing the
move (mov) instruction while the condition code is being set. The code in Case 1 thus executes
one clock cycle faster than the code in Case 2.
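

The one-cycle difference between the two cases can be reproduced with a small Python model. The
per-instruction costs below are assumptions chosen to be consistent with the totals quoted in
the text (the manual does not break the totals down): addc with two register sources is taken
as 2 clocks, mov as 1 clock, and an instruction that reads the condition code pays 1 extra clock
when it immediately follows an instruction that sets it. All names are hypothetical.

```python
BASE_CLOCKS = {"addc": 2, "mov": 1}   # assumed per-instruction costs
SETS_CC = {"addc"}                    # instructions that write the condition code
READS_CC = {"addc"}                   # instructions that read the condition code


def sequence_clocks(opcodes):
    """Total clocks for a straight-line sequence under the assumed model."""
    total = 0
    previous_set_cc = False
    for op in opcodes:
        clocks = BASE_CLOCKS[op]
        # Wait one clock if the condition code is still being set.
        if op in READS_CC and previous_set_cc:
            clocks += 1
        total += clocks
        previous_set_cc = op in SETS_CC
    return total


CASE_1 = ["addc", "mov", "addc"]
CASE_2 = ["addc", "addc", "mov"]
```

Under these assumptions Case 1 totals 5 clocks and Case 2 totals 6, which matches the figures
given above; the point of the sketch is only that an unrelated instruction placed between the
two addc instructions absorbs the wait.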


The FPU performs all the floating-point computations for the processor, as well as the integer
multiply and divide operations. It also manages the four 80-bit floating-point registers, which
it uses for extended-precision, floating-point calculations.


The FPU shares the resources of the processor. For example, it can use the global and local
registers as operands for floating-point operations. It also gets microcode for the execution
of complex floating-point instructions from the MIS.


To perform integer multiplication and several floating-point calculations, the FPU contains a
32-bit integer Booth multiplier. This multiplier performs an integer multiplication in a
variable amount of time, depending on the number of significant bits.


The following section describes the execution times that can be expected for the various
instructions in the 80960KB processor. As illustrated in the previous sections of this
appendix, the execution time for each instruction can vary considerably, for two reasons.
First, many instructions can vary in execution time, depending on their arguments and the state
of the on-chip resources being used. Second, by taking advantage of pipelining and overlapping
of operations, a program can be written in which some instructions, in effect, take no clock
cycles to execute.


In the following discussion of instruction timing, the execution time of an instruction is
defined as the time between the beginning of actual execution of a decoded instruction and the
beginning of execution for the next decoded instruction. For example, the illustration in
Figure C-2 shows the execution time of a two-operand instruction to be two clocks, with respect
to the next instruction to be executed.


[Figure C-2: decode and execute stages of a two-operand instruction]


The following paragraphs show the instruction times for the instructions defined in the 80960
architecture.


The timing of the logical instructions depends on the IEU bypass mechanism described earlier
in this appendix, in particular for any instruction of the form:


        alu_instruction    src1, src2, dst


If src1 or src2 is a literal or if src2 is the result of the previous operation, a bypass hit
occurs. Otherwise, there is no bypass hit and the instruction requires an extra clock to load
the second operand. Table C-2 shows the timing of the logical instructions depending on whether
or not a bypass hit occurs.
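

The bypass rule stated above reduces to a simple predicate. The following Python sketch is
illustrative only: the function name and parameters are hypothetical, and it covers only
two-source instructions of the form shown (a one-source instruction such as not is outside its
scope).

```python
def alu_clocks(bypass_hit_clocks, *, src1_is_literal=False,
               src2_is_literal=False, src2_is_previous_result=False):
    """Clocks for a two-source IEU instruction under the bypass rule.

    bypass_hit_clocks -- the instruction's time when a bypass hit occurs
    """
    bypass_hit = src1_is_literal or src2_is_literal or src2_is_previous_result
    # A bypass miss costs one extra clock to load the second operand.
    return bypass_hit_clocks if bypass_hit else bypass_hit_clocks + 1
```

For an instruction whose bypass-hit time is 1 clock, the function returns 1 when either source
is a literal and 2 when both sources must be read from registers.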


Note 


In all the following tables, execution time is given in number of clock cycles.


Instruction    Normal Case Execution Time    Worst Case Execution Time
               (Bypass Hit)                  (Bypass Miss)
and            1                             2
nand           1                             2
or             1                             2
nor            1                             2
xor            1                             2
xnor           1                             2
andnot         1                             2
notand         1                             2
not            1                             1
notor          1                             2
ornot          1                             2
rotate         1                             2
shlo           1                             2
shro           1                             2
shli           2                             3
shri           2                             3
shrdi          2                             3


The execution times for the bit instructions also depend on whether or not a register bypass
has occurred, as is shown in Table C-3.


Table C-3: Bit Instruction Timing


Instruction    Normal Case Execution Time    Worst Case Execution Time
               (Bypass Hit)                  (Bypass Miss)
notbit         2                             3
setbit         2                             3
clrbit         2                             3
alterbit       2                             3
chkbit         2                             3
extract        7                             7
modify         8                             8


The execution times of the scanbit and spanbit instructions (shown in Table C-4) depend on
condition code scoreboarding. If the condition code is not set by the previous instruction
execution, the instruction will complete in one less clock cycle. Execution time is also
dependent on the number of bits operated upon.


Table C-4: Scan and Span Bit Instruction Timing


Instruction    Best Case         Normal Case       Worst Case
               Execution Time    Execution Time    Execution Time
scanbit        8                 11                14
spanbit        8                 11                14


The timing of instructions that move data between registers is directly related to the number of
words moved. One clock cycle is required to move one word (as shown in Table C-5).


Table C-5: Move Instruction Timing


Instruction    Execution Time
mov            1
movl           2
movt           3
movq           4


The execution times for the basic add, subtract, and comparison instructions (as shown in
Table C-6) depend on register bypass. The normal-case results are achieved when a register
bypass occurs.


Instruction    Normal Case Execution Time    Worst Case Execution Time
               (Bypass Hit)                  (Bypass Miss)
addo           1                             2
addi           1                             2
subo           1                             2
subi           1                             2
cmpo           1                             2
cmpi           1                             2
cmpinco        2                             3
cmpdeco        2                             3
cmpinci        2                             3
cmpdeci        2                             3

The execution times for the add and subtract with carry and conditional compare instructions
(shown in Table C-7) depend on condition code scoreboarding. If the instruction executed prior
to any of these instructions sets the condition code (CC), the worst-case instruction execution
time occurs; if an instruction is inserted between the instruction that sets the condition code
and one of the instructions listed in Table C-7, the instruction is executed in the normal-case
time.


Instruction    Normal Case Execution Time    Worst Case Execution Time
               (CC Available)                (CC Not Available)
addc           1                             2
subc           1                             2
subi           1                             2
concmpi        1                             2

Table C-8 shows the typical instruction execution times for the multiply and divide
instructions:


Instruction    Range of            Typical Case
               Significant Bits    Execution Time
mulo           9 to 21             18
muli           9 to 21             18
divi           37                  37
divo           37                  37
remo           37                  37
remi           37                  37
modi           37                  37
emul           37                  24
ediv           37                  40


Since the processor contains a Booth multiplier with early out, the execution times of the
multiply and divide instructions (shown in Table C-8) depend on the number of significant bits
in the src1 operand. For example, Table C-9 shows the execution times based on the number of
significant bits in src1:


Table C-9: Multiply/Divide Execution Times Based on Significant Bits


Src1 Significant Bits    Execution Time
2                        9
4                        10
8                        11
32                       21


Note that the shift instructions or the add and subtract instructions may be faster than the
multiply instructions in certain instances (for example, when multiplying by 3, 5, 15, etc.).
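

The shift-and-add substitution mentioned above can be checked with a few lines of Python. This
is a sketch of the arithmetic identity only; the function names are hypothetical, and the
32-bit mask models ordinal (unsigned) wraparound as an assumption. On the processor, each
function would correspond to one shift followed by one add or subtract.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit ordinal arithmetic (assumed wraparound)


def times3(x):
    """x * 3 as (x << 1) + x."""
    return ((x << 1) + x) & MASK32


def times5(x):
    """x * 5 as (x << 2) + x."""
    return ((x << 2) + x) & MASK32


def times15(x):
    """x * 15 as (x << 4) - x."""
    return ((x << 4) - x) & MASK32
```

Each replacement uses two instructions that the timing tables list at 1 to 2 clocks apiece,
compared with the 9 to 21 clocks shown for a multiply in Table C-9.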


Branch instructions are executed directly by the ID and do not require IEU or FPU resources.
Because of this, branch instructions can in most cases be programmed so that their execution is
overlapped with other operations. Table C-10 lists the ranges of times for execution of branch
instructions, from best (maximum overlap) to worst (no overlap). (The instructions in capital
letters indicate groups of instructions that branch on condition codes, such as the BRANCH IF
instructions, be, bg, bl, etc.)


Instruction           Best Case Execution Time    Worst Case Execution Time
                      (CC Available)              (CC Not Available)
b                     0 to 2 (0 to 2)             0 to 2 (0 to 2)
BRANCH IF             0 to 2 (0 to 1)             0 to 3 (0 to 2)
bx                    0 to 6 (0 to 6)             0 to 6 (0 to 6)
BRANCH AND LINK       2 to 8 (2 to 8)             2 to 8 (2 to 8)
COMPARE AND BRANCH    3 to 5 (3 to 4)             3 to 5 (3 to 4)
TEST IF               0 to 3 (0 to 2)             0 to 4 (0 to 3)
FAULT IF              0 to 2 (0 to 1)             0 to 3 (0 to 2)

The second column of the table (CC Available) lists execution-time ranges for conditional
branches in which the condition code was not set in the previous instruction, and the third
column (CC Not Available) lists ranges for branches in which the condition code was set by the
previous instruction. Also, the first range in each column is for the case in which the branch
is taken, and the range in parentheses is for the case in which the branch is not taken.


When writing optimized code for the 80960KB processor, it is best to perform conditional tests
at least one instruction before a conditional branch. This practice allows the execution times
in column two to be achieved. It is also important to note that the "not taken" branch case
executes in one less cycle, because there is no break in the pipeline. (Remember, instruction
time is defined as the time from the start of execution of one instruction to the start of
execution of the next instruction. If the pipeline is stalled, the fetch of the next
instruction will be delayed one clock. This delay may or may not be hidden by the parallelism of
the 80960KB processor.)


As described earlier in this appendix, the 80960KB processor provides four sets of local
registers. When a call instruction is executed, the processor allocates a new set of local
registers to the called procedure or interrupt routine. If, when a call or callx instruction is
executed, a set of local registers is available, the processor executes the instruction in 9
clock cycles.


If a set of local registers is not available, the processor flushes the oldest set of registers
to the stack in memory to free up a register set. Flushing a set of local registers requires
four quad-word stores to memory. Assuming zero-wait-state memory, this operation adds 24 clocks
to the 9 clocks normally required to execute a call.


The ret 
(return) 
instruction 
normally 
requires 
7 clock cycles. 
If the local registers 
being 
returned 
to have been flushed 
to the stack, an additional 
24 clocks 
must be added to this 
execution 
time (with zero-wait-state 
memory) 
for the processor 
to reload the local registers 
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from the stack. 
It is important to note that the processor 
only reloads the local registers when 
they are required, thus eliminating 
unnecessary 
memory cycles. 
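

The call and return costs above can be tallied with a short model. The following Python sketch
is illustrative only: the function name, the event encoding ('c' for a call, 'r' for a return)
and the assumption that one register set is already in use when the sequence starts are not
from this manual. The clock counts are the zero-wait-state figures quoted above.

```python
CALL_CLOCKS = 9      # call or callx when a local-register set is available
RET_CLOCKS = 7       # ret when the caller's registers are still on chip
SPILL_CLOCKS = 24    # extra clocks to flush or reload one set (zero wait states)
REGISTER_SETS = 4    # on-chip sets of local registers


def call_return_clocks(events, resident=1):
    """Total clocks for a string of 'c' (call) and 'r' (return) events.

    `resident` is the number of register sets in use when the sequence
    starts; the default of 1 is an assumption of this sketch.
    """
    total = 0
    for event in events:
        if event == "c":
            if resident == REGISTER_SETS:
                # No free set: the oldest set is flushed to the stack.
                total += CALL_CLOCKS + SPILL_CLOCKS
            else:
                total += CALL_CLOCKS
                resident += 1
        elif event == "r":
            resident -= 1
            if resident == 0:
                # The caller's set was flushed earlier and must be reloaded.
                total += RET_CLOCKS + SPILL_CLOCKS
                resident = 1
            else:
                total += RET_CLOCKS
        else:
            raise ValueError("events may contain only 'c' and 'r'")
    return total


print(call_return_clocks("ccc"))       # three calls, no flush
print(call_return_clocks("ccccrrrr"))  # fourth call flushes, last return reloads
```

Under these assumptions, the fourth nested call is the first to pay the 24-clock flush
penalty, and the matching reload is paid only by the return that needs the flushed registers.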


A load instruction requires the following steps: 


1. Instruction Fetch
2. Decode
3. Compute Effective Address/Scoreboard Register(s)
4. Place Address on Bus
5. Wait State(s)
6. Receive Data on Bus
7. Place Data in target register


Of these steps, only steps 3 through 7 are included in the definition 
of execution 
time for an 
instruction. 
The following figures show several examples of load instruction timing depending 
on where the load instruction is placed in the instruction stream. 


The example in Figure C-3 illustrates a load instruction where the instruction that follows
requires the fetched data. Here, the pipeline is stalled while the processor waits for the load
to complete. Assuming a one-clock-cycle effective-address calculation, the load will require 4
or 5 clock cycles to be executed, depending on whether or not zero-wait-state memory is used.


[Figure C-3: timing diagram showing the previous instruction (decode, execute), the ld
instruction, and the instruction using the ld result (fetch, decode, execute, result)]


Figure C-4 gives an example of a load instruction where the instruction that follows does not
require the data being fetched from memory. Here, the unrelated instruction can be executed
while the load is being completed. The 2 clock cycles required to execute the unrelated
instruction are then overlapped with the 4 or 5 cycles required to execute the load (again
depending on whether or not zero-wait-state memory is used). The load instruction thus
requires a net of 1 or 2 clock cycles from the pipeline to be executed.


[Timing diagram showing the previous instruction (decode, execute), the ld instruction with its
execution time, and the unrelated instruction (fetch, decode, execute, result)]

Figure C-4: Load Where the Next Instruction Does Not Require the Fetched Data


Finally, Figure C-5 shows an example of two load instructions being executed back-to-back.
These two instructions can be executed in 5 or 6 clock cycles, as long as the number of BCL
requests is limited to 3 or fewer (which is the size of the output request FIFO in the BCL's
control queue). Here, the second load is almost completely overlapped by the first load. Times
for multiple word loads will be lengthened 1 cycle plus wait states for each additional word.
If more than 3 requests become outstanding, the processor will wait until the number of
outstanding load operations goes below the size of the output FIFO.


Store instructions involve a posting of an address and data request to the BCL and are usually
executed in 2 to 3 clock cycles. (They do not require register scoreboarding.) If the
instruction following a store instruction is another store instruction, the second store
instruction is usually executed in 2 clock cycles. If the following instruction uses the IEU,
the execution time is 3 clock cycles. The only case in which this time will increase is when
the three-request output FIFO in the BCL becomes full. Here, if another store instruction is
issued, the processor waits for the BCL to complete its operations before other instructions
can execute.
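

The effect of the three-entry output FIFO can be pictured with a toy queue model. The sketch
below is a simplification written for illustration, not a description of the BCL's actual
behavior: it assumes that requests are serviced one at a time, that each takes a fixed,
hypothetical number of clocks (service_clocks), and that a FIFO entry is freed when its request
completes. The function name and parameters are likewise assumptions.

```python
from collections import deque

FIFO_DEPTH = 3  # size of the BCL output request FIFO, as stated in the text


def accepted_clocks(wanted_clocks, service_clocks):
    """Return the clock at which each load/store request is accepted.

    wanted_clocks:  clocks at which the processor would like to post each
                    request (in program order).
    service_clocks: hypothetical fixed bus time per request.
    """
    in_flight = deque()   # completion clocks of requests still queued
    accepted = []
    clock = 0             # the processor cannot go back in time after a stall
    bus_free_at = 0
    for wanted in wanted_clocks:
        t = max(wanted, clock)
        # Drop requests that have already completed by clock t.
        while in_flight and in_flight[0] <= t:
            in_flight.popleft()
        # FIFO full: the processor waits for the oldest request to finish.
        if len(in_flight) == FIFO_DEPTH:
            t = in_flight.popleft()
        start = max(t, bus_free_at)
        bus_free_at = start + service_clocks
        in_flight.append(bus_free_at)
        accepted.append(t)
        clock = t
    return accepted


# Stores posted every 2 clocks against a slow (10-clock) memory: the first
# three are accepted immediately, later ones wait for a FIFO slot.
print(accepted_clocks([0, 2, 4, 6, 8], service_clocks=10))
```

With a fast memory the requests never accumulate and no stall appears; with a slow one the
fourth and later requests are delayed, which is the case the text describes.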


The following paragraphs show the execution times for those 80960KB instructions that are
extensions to the 80960 architecture.


Instruction     Execution Time

dmovt           7
daddc           8
dsubc           8

Table C-12 shows the instruction execution times for the simple floating-point instructions.
Where applicable, a range and a typical observed average are given.


Instruction     Execution Time

movr            5
movrl           5 to 7
movre           7 to 8
cpysre          8
cpyrsre         8
addr            9 to 17 (typical 10)
addrl           12 to 20 (typical 13)
subr            9 to 17 (typical 10)
subrl           12 to 20 (typical 13)
mulr            11 to 22 (typical 20)
mulrl           14 to 43 (typical 36)
divr            35
divrl           77
cmpr            10
cmprl           12
cmpor           10
cmporl          12
cvtri           25 to 33
cvtril          26 to 35
cvtir           41 to 45
cvtilr          42 to 46
cvtzri          41 to 45
cvtzril         42 to 46
roundr          56 to 69
roundrl         56 to 70
scaler          28
scalerl         30
logbnr          32 to 41
logbnrl         32 to 43

The instructions given in Table C-13 are the complex floating-point instructions. Only typical
instruction execution times are given here. In many cases, the clock count can vary by 30-40%.
Execution time is dependent on the operands.


Instruction     Execution Time

sqrtrl          104
expr            300
exprl           334
logepr          400
logeprl         420
logr            438
logrl           438
remr            (67 to 75878)
remrl           (67 to 75878)
atanr           267
atanrl          350
cosr            406
cosrl           441
tanr            293
tanrl           323

It is important to note that these floating-point instructions are interruptible. When an
interrupt is received while one of these instructions is being executed, the processor can
suspend execution, service the external request, then resume execution of the instruction.


APPENDIX D
INITIALIZATION CODE


This appendix provides an example of the code required to initialize the 80960KB processor.


The code given in this appendix demonstrates one of the methods that can be used to initialize
the 80960KB processor. To use this code, the programmer must assemble (and compile, in the
case of the C program modules) the individual files into object modules. These modules must
then be loaded into ROM (generally EPROM). The resulting EPROM will contain an IMI (as
shown in Figure 7-3); an interrupt table; a fault table; a system procedure table; a set of
dummy interrupt and fault handler routines; and a set of dummy system procedures. (The
dummy interrupt and fault handler routines merely perform a return to the initialization code
if an interrupt or fault occurs during initialization. Likewise, the dummy system procedures
perform returns. These routines may be changed to suit the needs of a particular application.)


When the RESET pin on the processor is asserted, the processor performs its self test, then
begins executing the initialization code. This code directs the processor to perform the
following rudimentary steps of initialization:


1. Copy the PRCB from the IMI into RAM.
2. Copy the interrupt table into RAM.
3. Execute a reinitialize processor IAC, to enable the processor to load the new pointers to
   the PRCB and interrupt table.


The PRCB and interrupt table are copied into RAM because both of these data structures have 
fields that the processor must be able to write. 
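

The copies in steps 1 and 2 can be pictured as a loop that moves one quad word (16 bytes) at a
time, which is the approach the example code in this appendix takes with its ldq and stq
instructions. The Python sketch below models that loop on a bytearray standing in for memory.
It is a model for illustration, not processor code: the function name is invented, and the
addresses in the usage lines are simply the values that appear in the example listing.

```python
QUAD_WORD = 16  # bytes moved per ldq/stq pair


def copy_quadwords(memory, source, destination, length):
    """Copy `length` bytes in 16-byte steps and return the bytes moved.

    The loop body always runs at least once and the test comes at the
    end, so a length that is not a multiple of 16 is rounded up.
    """
    offset = 0
    while True:
        chunk = memory[source + offset:source + offset + QUAD_WORD]             # load 4 words
        memory[destination + offset:destination + offset + QUAD_WORD] = chunk   # store 4 words
        offset += QUAD_WORD                                                     # increment index
        if offset >= length:                                                    # loop until done
            break
    return offset


memory = bytearray(0x2000)                       # stand-in for ROM and RAM
memory[0x20:0x20 + 176] = bytes(range(176))      # pretend PRCB image at 0x20
moved = copy_quadwords(memory, 0x20, 0x690, 176) # 176-byte PRCB copy
print(moved)
```

The same routine would serve for the 1024-byte interrupt table; only the source, destination
and length change.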


Once these first steps of initialization have been completed, the processor is able to execute
additional initialization steps to configure the processor for a particular application. The
following items are examples of further initialization actions that might be included in the
initialization code:


• Copy new interrupt handler routines into RAM and change the pointers in the interrupt
  table to point to these new routines.

• Copy the fault table into RAM; copy new fault handler routines into RAM; change the
  pointers in the fault table to point to the new fault handler routines; and change the pointer
  in the PRCB to point to the relocated fault table.

• Create a new system procedure table in RAM; copy the system procedures into RAM;
  change the pointer in the PRCB to point to the new system procedure table.


Alternatively, the interrupt handler routines, fault handler routines, and system procedures
can all be loaded into ROM. Here, execution of an application program can begin directly
following the reinitialization of the processor.


The example code is contained in the following files:

• example.lst
• f_table.lst
• i_table.lst
• f_handler.c
• i_handler.c
• cold.ld


The first three files are listings from the Intel 80960KB 
Assembler. 
These listings include 
assembly code (such as would be included in an ".s" file) and the resulting object code. 
The 
fourth and fifth files are C program modules. 
The sixth file is a load module. 


The following steps describe how to use the code in these files:


1. Assemble the assembly code in files example.s, f_table.s, and i_table.s. (Here the ".s"
   files are made up of the assembly code only from the ".lst" files listed above.)

2. Compile the C code in files f_handler.c and i_handler.c.

3. Link the object modules (example.o, f_table.o, i_table.o, f_handler.o, and i_handler.o),
   using the 80960 Linker and the script in the cold.ld file. The script in cold.ld directs the
   linker to locate the linked code at address 0.

4. Burn the output file from the linker into an EPROM.


example.lst


                         ##################################################################
                         #
                         # Below is example system initialization code and tables.
                         # The code builds the prcb in memory, sets up the stack frame,
                         # the interrupt, fault, and system procedure tables, and
                         # then vectors to a user defined routine.
                         #
                         ##################################################################

                             .globl  system_address_table
                             .globl  prcb_ptr
                             .globl  start_ip
                             .globl  cs1

                             .text
0000  00000140               .word   system_address_table
0004  00000020               .word   prcb_ptr
0008  00000000               .word   0
000c  000001e8               .word   start_ip                # Pointer to first IP
0010  00000000               .word   cs1                     # calculated at link time
                                                             # cs1 = -(segtab + PRCB + startup)
0014  00000000               .word   0
0018  00000000               .word   0
001c  ffffffff               .word   -1

                         # initial PRCB
                         # This is our startup PRCB. After initialization, this will
                         # be copied to RAM
                         prcb_ptr:
0020  00000000               .word   0x0                     #   0 - reserved
0024  00000000               .word   0x0                     #   4 - initialize to 0
0028  00000000               .word   0x0                     #   8 - reserved
002c  00000000               .word   0x0                     #  12 - reserved
0030  00000000               .word   0x0                     #  16 - reserved
0034  00000000               .word   intr_table              #  20 - interrupt table address
0038  00000f50               .word   intr_stack              #  24 - interrupt stack pointer
003c  00000000               .word   0x0                     #  28 - reserved
0040  0000027f               .word   0x0000027f              #  32 -
0044  0000027f               .word   0x0000027f              #  36 -
0048  00000000               .word   fault_table             #  40 - fault table
004c  00000000               .word   0x0                     #  44 - reserved
0050                         .space  12                      #  48 - reserved
005c  00000000               .word   0x0                     #  60 - reserved
0060                         .space  8                       #  64 - reserved
0068  00000000               .word   0x0                     #  72 - reserved
006c  00000000               .word   0x0                     #  76 - reserved
0070                         .space  48                      #  80 - scratch space (resumption)
00a0                         .space  44                      # 128 - scratch space (error)

0100  00000000               .word   0                       # Reserved 0
0104  00000000               .word   0                       # Reserved 4
0108  00000000               .word   0                       # Reserved 8
010c  00001150               .word   sup_stack               # Supervisor stack pointer 12
0110  00000000               .word   0                       # Preserved
0114  00000000               .word   0                       # Preserved
0118  00000000               .word   0                       # Preserved
011c  00000000               .word   0                       # Preserved
0120  00000000               .word   0                       # Preserved
0124  00000000               .word   0                       # Preserved
0128  00000000               .word   0                       # Preserved
012c  00000000               .word   0                       # Preserved
                         # These pointers are to dummy
                         # supervisor routines. They
                         # are for example only
0130  000001e0               .word   proc_entry_0            # Procedure entry (user)
0134  000001e6               .word   (proc_entry_1 + 0x2)    # Procedure entry (sup.)

                             .align  6
                         system_address_table:
0140                         .space  136
01c8                         .word   system_address_table
01cc                         .word   0x00fc00fb
01d0                         .space  8

                         # -- Below are two
                         # -- would contain
                             .align  4
                             .text
                         proc_entry_0:
01e0                         ret
                         proc_entry_1:
01e4                         ret

                         start_ip:
01e8  8c800400               lda     1024, g0                # load length of intr table
01ec  8ca00000               lda     0, g4                   # initialize offset to 0
01f0  8c883000 00000000      lda     intr_table, g1          # load source
01f8  8c903000 00000290      lda     intr_ram, g2            # load address of new table
0200  0b000040               bal     loop_here               # branch to move routine

0204  8c8000b0               lda     176, g0                 # load length of prcb
0208  8ca00000               lda     0, g4                   # initialize offset to 0
020c  8c883000 00000020      lda     prcb_ptr, g1            # load source
0214  8c903000 00000690      lda     prcb_ram, g2            # load destination
021c  0b000024               bal     loop_here               # branch to move routine

                         # At this point, the prcb and interrupt table have been moved to RAM.
                         # It is time to issue a REINITIALIZE IAC, which will start us anew
                         # with our RAM based prcb.
                         #
                         # The IAC message, found in the 4 words located at the
                         # reinitialize_iac label, contains pointers to the current
                         # System address table, the new, RAM based PRCB, and to the
                         # instruction pointer labeled start_again_ip
022c  8ca83000 ff000010      lda     local_IAC, g5
0234  8cb03000 00000280      lda     reinitialize_iac, g6
023c  6005a115               synmovq g5, g6

                         #
                         # Below is the software loop to move data
                         #
                         loop_here:
0240  b0c45c14               ldq     (g1)[g4*1], g8          # load 4 words into g8
0244  b2c49c14               stq     g8, (g2)[g4*1]          # store to ram proc. block
0248  59a41094               addi    g4, 16, g4              # increment index
024c  39851ff4               cmpibg  g0, g4, loop_here       # loop until done
0250  84079000               bx      (g14)

                         start_again_ip:
0254                         lda
025c                         lda
                         # g14 used by C compiler for argument lists past 13 arguments.
                         # Initialize to 0
                         #
                         # set up arith. controls to mask unwanted exceptions
                         #
                         # call main code from here
                         #
                         # Note: This setup assumes a main module "main()" written in
                         # C. Also, no opens are done for stdin, stdout, or stderr.
                         # If I/O is required, the devices would need to be opened
                         # before the call to main.

                         reinitialize_iac:
0280  93000000               .word   0x93000000              # reinitialize iac message
0284  00000140               .word   system_address_table
0288  00000690               .word   prcb_ram                # use newly copied prcb
028c  00000254               .word   start_again_ip          # start here

                             .align
                         intr_ram:
0290                         .space  1024
                         prcb_ram:
0690                         .space  176

                         user_stack:                         # reserved area for the user stack
                                                             # this can be located anywhere in memory
                                                             # Size is set depending on application needs
0750                         .space  0x800

                         intr_stack:                         # reserved area for the interrupt stack
                                                             # this can be located anywhere in memory
0f50                         .space  0x200
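

Two of the values in example.lst lend themselves to a short worked example: the checksum word
cs1, which the listing comment defines as -(segtab + PRCB + startup), and the four-word
reinitialize IAC message. The Python sketch below computes both from the addresses that
appear in the listing's object code (0x140, 0x20, 0x1e8, 0x690 and 0x254). It is illustrative
only: the function names are invented, the checksum follows the listing comment rather than a
full description of the processor's self-test check, and the little-endian byte order used to
pack the words is an assumption of this sketch.

```python
import struct

MASK32 = 0xFFFFFFFF


def checksum_word(segtab, prcb, startup):
    """cs1 = -(segtab + PRCB + startup), reduced to a 32-bit word."""
    return (-(segtab + prcb + startup)) & MASK32


def reinitialize_iac(system_address_table, prcb, start_ip):
    """Pack the four words found at the reinitialize_iac label.

    Little-endian packing is an assumption of this sketch.
    """
    return struct.pack("<4I", 0x93000000, system_address_table, prcb, start_ip)


cs1 = checksum_word(0x140, 0x20, 0x1E8)
print(hex(cs1))

message = reinitialize_iac(0x140, 0x690, 0x254)
print(message.hex())
```

Adding cs1 to the three addresses gives zero modulo 2^32, which is the property the negated
sum is meant to provide.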




f_table.lst


                  /******************************************************************/
                  /* User Fault Table                                               */
                      .globl  fault_table
                      .align  8
                  fault_table:
0000  00000000        .word   _user_reserved          # Type 0 Reserved Fault Handler
0004  00000000        .word   0
0008  00000000        .word   _user_trace;
000c  00000000        .word   0
0010  00000000        .word   _user_operation;
0014  00000000        .word   0
0018  00000000        .word   _user_arithmetic;
001c  00000000        .word   0
0020  00000000        .word   _user_real_arithmetic;
0024  00000000        .word   0
0028  00000000        .word   _user_constraint;
002c  00000000        .word   0
0030  00000000        .word   _user_reserved          # Type 6 Reserved Fault Handler
0034  00000000        .word   0
0038  00000000        .word   _user_protection;
003c  00000000        .word   0
0040  00000000        .word   _user_machine;
0044  00000000        .word   0
0048  00000000        .word   _user_reserved;
004c  00000000        .word   0
0050  00000000        .word   _user_type;
0054  00000000        .word   0
0058  00000000        .word   _user_reserved          # Type 11 Reserved Fault Handler
005c  00000000        .word   0
0060  00000000        .word   _user_reserved          # Type 12 Reserved Fault Handler
0064  00000000        .word   0
0068  00000000        .word   _user_reserved          # Type 13 Reserved Fault Handler
006c  00000000        .word   0
0070  00000000        .word   _user_reserved          # Type 14 Reserved Fault Handler
0074  00000000        .word   0
0078  00000000        .word   _user_reserved          # Type 15 Reserved Fault Handler
007c  00000000        .word   0
0080  00000000        .word   _user_reserved          # Type 16 Reserved Fault Handler
0084  00000000        .word   0
0088  00000000        .word   _user_reserved          # Type 17 Reserved Fault Handler
008c  00000000        .word   0
0090  00000000        .word   _user_reserved          # Type 18 Reserved Fault Handler
0094  00000000        .word   0
0098  00000000        .word   _user_reserved          # Type 19 Reserved Fault Handler
009c  00000000        .word   0
00a0  00000000        .word   _user_reserved          # Type 20 Reserved Fault Handler
00a4  00000000        .word   0
00a8  00000000        .word   _user_reserved          # Type 21 Reserved Fault Handler
00ac  00000000        .word   0
00b0  00000000        .word   _user_reserved          # Type 22 Reserved Fault Handler
00b4  00000000        .word   0
00b8  00000000        .word   _user_reserved          # Type 23 Reserved Fault Handler
00bc  00000000        .word   0
00c0  00000000        .word   _user_reserved          # Type 24 Reserved Fault Handler
00c4  00000000        .word   0
00c8  00000000        .word   _user_reserved          # Type 25 Reserved Fault Handler
00cc  00000000        .word   0
00d0  00000000        .word   _user_reserved          # Type 26 Reserved Fault Handler
00d4  00000000        .word   0
00d8  00000000        .word   _user_reserved          # Type 27 Reserved Fault Handler
00dc  00000000        .word   0
00e0  00000000        .word   _user_reserved          # Type 28 Reserved Fault Handler
00e4  00000000        .word   0
00e8  00000000        .word   _user_reserved          # Type 29 Reserved Fault Handler
00ec  00000000        .word   0
00f0  00000000        .word   _user_reserved          # Type 30 Reserved Fault Handler
00f4  00000000        .word   0
00f8  00000000        .word   _user_reserved          # Type 31 Reserved Fault Handler
00fc  00000000        .word   0
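

The layout of the fault table listing (32 fault types, two words per entry, with the handler
address in the first word and 0 in the second) can be summarized in a few lines. The Python
sketch below builds such a table image and locates an entry by fault type. The function names,
the handler addresses in the usage lines and the little-endian packing are assumptions made
for illustration; only the layout itself is taken from the listing.

```python
import struct

FAULT_TYPES = 32     # entries in the table (types 0 through 31)
ENTRY_BYTES = 8      # two 32-bit words per entry


def entry_offset(fault_type):
    """Byte offset of a fault type's entry from the start of the table."""
    return ENTRY_BYTES * fault_type


def build_fault_table(handlers, default_handler):
    """Return the table image; unlisted types use default_handler."""
    words = []
    for fault_type in range(FAULT_TYPES):
        words.append(handlers.get(fault_type, default_handler))  # handler address
        words.append(0)                                           # second word is 0 in the listing
    return struct.pack("<%dI" % len(words), *words)


# Hypothetical addresses: type 1 (trace) at 0x2000, everything else at 0x1000.
table = build_fault_table({1: 0x2000}, default_handler=0x1000)
print(len(table), hex(entry_offset(10)))
```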




i_table.lst 


                  /* Initial Interrupt Table */
                      .globl  intr_table
                      .align  6
                  intr_table:
0000  00000000        .word   0                   # Pending Priorities 0
0004                  .fill   8,4,0               # Pending Interrupts 4 + (0->7)*4
0024  00000000        .word   _user_intrh;        # interrupt table entry 8
0028  00000000        .word   _user_intrh;        # interrupt table entry 9
002c  00000000        .word   _user_intrh;        # interrupt table entry 10
0030  00000000        .word   _user_intrh;        # interrupt table entry 11
0034  00000000        .word   _user_intrh;        # interrupt table entry 12
0038  00000000        .word   _user_intrh;        # interrupt table entry 13
003c  00000000        .word   _user_intrh;        # interrupt table entry 14
0040  00000000        .word   _user_intrh;        # interrupt table entry 15
0044  00000000        .word   _user_intrh;        # interrupt table entry 16
0048  00000000        .word   _user_intrh;        # interrupt table entry 17
004c  00000000        .word   _user_intrh;        # interrupt table entry 18
0050  00000000        .word   _user_intrh;        # interrupt table entry 19
0054  00000000        .word   _user_intrh;        # interrupt table entry 20
0058  00000000        .word   _user_intrh;        # interrupt table entry 21
005c  00000000        .word   _user_intrh;        # interrupt table entry 22
0060  00000000        .word   _user_intrh;        # interrupt table entry 23
0064  00000000        .word   _user_intrh;        # interrupt table entry 24
0068  00000000        .word   _user_intrh;        # interrupt table entry 25
006c  00000000        .word   _user_intrh;        # interrupt table entry 26
0070  00000000        .word   _user_intrh;        # interrupt table entry 27
0074  00000000        .word   _user_intrh;        # interrupt table entry 28
0078  00000000        .word   _user_intrh;        # interrupt table entry 29
007c  00000000        .word   _user_intrh;        # interrupt table entry 30
0080  00000000        .word   _user_intrh;        # interrupt table entry 31
0084  00000000        .word   _user_intrh;        # interrupt table entry 32
0088  00000000        .word   _user_intrh;        # interrupt table entry 33
008c  00000000        .word   _user_intrh;        # interrupt table entry 34
0090  00000000        .word   _user_intrh;        # interrupt table entry 35
0094  00000000        .word   _user_intrh;        # interrupt table entry 36
0098  00000000        .word   _user_intrh;        # interrupt table entry 37
009c  00000000        .word   _user_intrh;        # interrupt table entry 38
00a0  00000000        .word   _user_intrh;        # interrupt table entry 39
00a4  00000000        .word   _user_intrh;        # interrupt table entry 40
00a8  00000000        .word   _user_intrh;        # interrupt table entry 41
00ac  00000000        .word   _user_intrh;        # interrupt table entry 42
00b0  00000000        .word   _user_intrh;        # interrupt table entry 43
00b4  00000000        .word   _user_intrh;        # interrupt table entry 44
00b8  00000000        .word   _user_intrh;        # interrupt table entry 45
00bc  00000000        .word   _user_intrh;        # interrupt table entry 46
00c0  00000000        .word   _user_intrh;        # interrupt table entry 47
00c4  00000000        .word   _user_intrh;        # interrupt table entry 48
00c8  00000000        .word   _user_intrh;        # interrupt table entry 49
00cc  00000000        .word   _user_intrh;        # interrupt table entry 50
00d0  00000000        .word   _user_intrh;        # interrupt table entry 51
00d4  00000000        .word   _user_intrh;        # interrupt table entry 52
00d8  00000000        .word   _user_intrh;        # interrupt table entry 53
00dc  00000000        .word   _user_intrh;        # interrupt table entry 54
00e0  00000000        .word   _user_intrh;        # interrupt table entry 55
00e4  00000000        .word   _user_intrh;        # interrupt table entry 56
00e8  00000000        .word   _user_intrh;        # interrupt table entry 57
00ec  00000000        .word   _user_intrh;        # interrupt table entry 58
00f0  00000000        .word   _user_intrh;        # interrupt table entry 59
00f4  00000000        .word   _user_intrh;        # interrupt table entry 60
00f8  00000000        .word   _user_intrh;        # interrupt table entry 61
00fc  00000000        .word   _user_intrh;        # interrupt table entry 62
0100  00000000        .word   _user_intrh;        # interrupt table entry 63
0104  00000000        .word   _user_intrh;        # interrupt table entry 64
0108  00000000        .word   _user_intrh;        # interrupt table entry 65
010c  00000000        .word   _user_intrh;        # interrupt table entry 66
0110  00000000        .word   _user_intrh;        # interrupt table entry 67
0114  00000000        .word   _user_intrh;        # interrupt table entry 68
0118  00000000        .word   _user_intrh;        # interrupt table entry 69
011c  00000000        .word   _user_intrh;        # interrupt table entry 70
0120  00000000        .word   _user_intrh;        # interrupt table entry 71
0124  00000000        .word   _user_intrh;        # interrupt table entry 72
0128  00000000        .word   _user_intrh;        # interrupt table entry 73
012c  00000000        .word   _user_intrh;        # interrupt table entry 74
0130  00000000        .word   _user_intrh;        # interrupt table entry 75
0134  00000000        .word   _user_intrh;        # interrupt table entry 76
0138  00000000        .word   _user_intrh;        # interrupt table entry 77
013c  00000000        .word   _user_intrh;        # interrupt table entry 78
0140  00000000        .word   _user_intrh;        # interrupt table entry 79
0144  00000000        .word   _user_intrh;        # interrupt table entry 70
0148  00000000        .word   _user_intrh;        # interrupt table entry 71
014c  00000000        .word   _user_intrh;        # interrupt table entry 72
0150  00000000        .word   _user_intrh;        # interrupt table entry 73
0154  00000000        .word   _user_intrh;        # interrupt table entry 74
0158  00000000        .word   _user_intrh;        # interrupt table entry 75
015c  00000000        .word   _user_intrh;        # interrupt table entry 76
0160  00000000        .word   _user_intrh;        # interrupt table entry 77
0164  00000000        .word   _user_intrh;        # interrupt table entry 78
0168  00000000        .word   _user_intrh;        # interrupt table entry 79
016c  00000000        .word   _user_intrh;        # interrupt table entry 80
0170  00000000        .word   _user_intrh;        # interrupt table entry 81
0174  00000000        .word   _user_intrh;        # interrupt table entry 82
0178  00000000        .word   _user_intrh;        # interrupt table entry 83
017c  00000000        .word   _user_intrh;        # interrupt table entry 84
0180  00000000        .word   _user_intrh;        # interrupt table entry 85
0184  00000000        .word   _user_intrh;        # interrupt table entry 86
0188  00000000        .word   _user_intrh;        # interrupt table entry 87
018c  00000000        .word   _user_intrh;        # interrupt table entry 88
0190  00000000        .word   _user_intrh;        # interrupt table entry 89
0194  00000000        .word   _user_intrh;        # interrupt table entry 90
0198  00000000        .word   _user_intrh;        # interrupt table entry 91
019c  00000000        .word   _user_intrh;        # interrupt table entry 92
01a0  00000000        .word   _user_intrh;        # interrupt table entry 93
01a4  00000000        .word   _user_intrh;        # interrupt table entry 94
01a8  00000000        .word   _user_intrh;        # interrupt table entry 95
01ac  00000000        .word   _user_intrh;        # interrupt table entry 96
01b0  00000000        .word   _user_intrh;        # interrupt table entry 97
01b4  00000000        .word   _user_intrh;        # interrupt table entry 98
01b8  00000000        .word   _user_intrh;        # interrupt table entry 99
01bc  00000000        .word   _user_intrh;        # interrupt table entry 100
01c0  00000000        .word   _user_intrh;        # interrupt table entry 101
01c4  00000000        .word   _user_intrh;        # interrupt table entry 102
01c8  00000000        .word   _user_intrh;        # interrupt table entry 103
01cc  00000000        .word   _user_intrh;        # interrupt table entry 104
01d0  00000000        .word   _user_intrh;        # interrupt table entry 105
01d4  00000000        .word   _user_intrh;        # interrupt table entry 106
01d8  00000000        .word   _user_intrh;        # interrupt table entry 107
01dc  00000000        .word   _user_intrh;        # interrupt table entry 108
01e0  00000000        .word   _user_intrh;        # interrupt table entry 109
01e4  00000000        .word   _user_intrh;        # interrupt table entry 110
01e8  00000000        .word   _user_intrh;        # interrupt table entry 111
01ec  00000000        .word   _user_intrh;        # interrupt table entry 112
01f0  00000000        .word   _user_intrh;        # interrupt table entry 113
01f4  00000000        .word   _user_intrh;        # interrupt table entry 114
01f8  00000000        .word   _user_intrh;        # interrupt table entry 115
01fc  00000000        .word   _user_intrh;        # interrupt table entry 116
0200  00000000        .word   _user_intrh;        # interrupt table entry 117
0204  00000000        .word   _user_intrh;        # interrupt table entry 118
0208  00000000        .word   _user_intrh;        # interrupt table entry 119
020c  00000000        .word   _user_intrh;        # interrupt table entry 120
0210  00000000        .word   _user_intrh;        # interrupt table entry 121
0214  00000000        .word   _user_intrh;        # interrupt table entry 122
0218  00000000        .word   _user_intrh;        # interrupt table entry 123
021c  00000000        .word   _user_intrh;        # interrupt table entry 124
0220  00000000        .word   _user_intrh;        # interrupt table entry 125
0224  00000000        .word   _user_intrh;        # interrupt table entry 126
0228  00000000        .word   _user_intrh;        # interrupt table entry 127
022c  00000000        .word   _user_intrh;        # interrupt table entry 128
0230  00000000        .word   _user_intrh;        # interrupt table entry 129
0234  00000000        .word   _user_intrh;        # interrupt table entry 130
0238  00000000        .word   _user_intrh;        # interrupt table entry 131
023c  00000000        .word   _user_intrh;        # interrupt table entry 132
0240  00000000        .word   _user_intrh;        # interrupt table entry 133
0244  00000000        .word   _user_intrh;        # interrupt table entry 134
0248  00000000        .word   _user_intrh;        # interrupt table entry 135
024c  00000000        .word   _user_intrh;        # interrupt table entry 136
0250  00000000        .word   _user_intrh;        # interrupt table entry 137
0254  00000000        .word   _user_intrh;        # interrupt table entry 138
0258  00000000        .word   _user_intrh;        # interrupt table entry 139
025c  00000000        .word   _user_intrh;        # interrupt table entry 140
0260  00000000        .word   _user_intrh;        # interrupt table entry 141
0264  00000000        .word   _user_intrh;        # interrupt table entry 142
0268  00000000        .word   _user_intrh;        # interrupt table entry 143
026c  00000000        .word   _user_intrh;        # interrupt table entry 144
0270  00000000        .word   _user_intrh;        # interrupt table entry 145
0274  00000000        .word   _user_intrh;        # interrupt table entry 146
0278  00000000        .word   _user_intrh;        # interrupt table entry 147
027c  00000000        .word   _user_intrh;        # interrupt table entry 148
0280  00000000        .word   _user_intrh;        # interrupt table entry 149
0284  00000000        .word   _user_intrh;        # interrupt table entry 150
0288  00000000        .word   _user_intrh;        # interrupt table entry 151
028c  00000000        .word   _user_intrh;        # interrupt table entry 152
0290  00000000        .word   _user_intrh;        # interrupt table entry 153
0294  00000000        .word   _user_intrh;        # interrupt table entry 154
0298  00000000        .word   _user_intrh;        # interrupt table entry 155
029c  00000000        .word   _user_intrh;        # interrupt table entry 156
02a0  00000000        .word   _user_intrh;        # interrupt table entry 157
02a4  00000000        .word   _user_intrh;        # interrupt table entry 158
02a8  00000000        .word   _user_intrh;        # interrupt table entry 159
02ac  00000000        .word   _user_intrh;        # interrupt table entry 160
02b0  00000000        .word   _user_intrh;        # interrupt table entry 161
02b4  00000000        .word   _user_intrh;        # interrupt table entry 162
02b8  00000000        .word   _user_intrh;        # interrupt table entry 163
02bc  00000000        .word   _user_intrh;        # interrupt table entry 164
02c0  00000000        .word   _user_intrh;        # interrupt table entry 165
02c4  00000000        .word   _user_intrh;        # interrupt table entry 166
02c8  00000000        .word   _user_intrh;        # interrupt table entry 167
02cc  00000000        .word   _user_intrh;        # interrupt table entry 168
02d0  00000000        .word   _user_intrh;        # interrupt table entry 169
02d4  00000000        .word   _user_intrh;        # interrupt table entry 170
02d8  00000000        .word   _user_intrh;        # interrupt table entry 171
02dc  00000000        .word   _user_intrh;        # interrupt table entry 172
02e0  00000000        .word   _user_intrh;        # interrupt table entry 173
02e4  00000000        .word   _user_intrh;        # interrupt table entry 174
02e8  00000000        .word   _user_intrh;        # interrupt table entry 175
02ec  00000000        .word   _user_intrh;        # interrupt table entry 176
02f0  00000000        .word   _user_intrh;        # interrupt table entry 177
02f4  00000000        .word   _user_intrh;        # interrupt table entry 178
02f8  00000000        .word   _user_intrh;        # interrupt table entry 179
189  02fc  00000000  .word _user_intrh;   # interrupt table entry 170
190  0300  00000000  .word _user_intrh;   # interrupt table entry 171
191  0304  00000000  .word _user_intrh;   # interrupt table entry 172
192  0308  00000000  .word _user_intrh;   # interrupt table entry 173
193  030c  00000000  .word _user_intrh;   # interrupt table entry 174
194  0310  00000000  .word _user_intrh;   # interrupt table entry 175
195  0314  00000000  .word _user_intrh;   # interrupt table entry 176
196  0318  00000000  .word _user_intrh;   # interrupt table entry 177
197  031c  00000000  .word _user_intrh;   # interrupt table entry 178
198  0320  00000000  .word _user_intrh;   # interrupt table entry 179
199  0324  00000000  .word _user_intrh;   # interrupt table entry 180
200  0328  00000000  .word _user_intrh;   # interrupt table entry 181
201  032c  00000000  .word _user_intrh;   # interrupt table entry 182
202  0330  00000000  .word _user_intrh;   # interrupt table entry 183
203  0334  00000000  .word _user_intrh;   # interrupt table entry 184
204  0338  00000000  .word _user_intrh;   # interrupt table entry 185
205  033c  00000000  .word _user_intrh;   # interrupt table entry 186
206  0340  00000000  .word _user_intrh;   # interrupt table entry 187
207  0344  00000000  .word _user_intrh;   # interrupt table entry 188
208  0348  00000000  .word _user_intrh;   # interrupt table entry 189
209  034c  00000000  .word _user_intrh;   # interrupt table entry 190
210  0350  00000000  .word _user_intrh;   # interrupt table entry 191
211  0354  00000000  .word _user_intrh;   # interrupt table entry 192
212  0358  00000000  .word _user_intrh;   # interrupt table entry 193
213  035c  00000000  .word _user_intrh;   # interrupt table entry 194
214  0360  00000000  .word _user_intrh;   # interrupt table entry 195
215  0364  00000000  .word _user_intrh;   # interrupt table entry 196
216  0368  00000000  .word _user_intrh;   # interrupt table entry 197
217  036c  00000000  .word _user_intrh;   # interrupt table entry 198
218  0370  00000000  .word _user_intrh;   # interrupt table entry 199
219  0374  00000000  .word _user_intrh;   # interrupt table entry 200
220  0378  00000000  .word _user_intrh;   # interrupt table entry 201
221  037c  00000000  .word _user_intrh;   # interrupt table entry 202
222  0380  00000000  .word _user_intrh;   # interrupt table entry 203
223  0384  00000000  .word _user_intrh;   # interrupt table entry 204
224  0388  00000000  .word _user_intrh;   # interrupt table entry 205
225  038c  00000000  .word _user_intrh;   # interrupt table entry 206
226  0390  00000000  .word _user_intrh;   # interrupt table entry 207
227  0394  00000000  .word _user_intrh;   # interrupt table entry 208
228  0398  00000000  .word _user_intrh;   # interrupt table entry 209
229  039c  00000000  .word _user_intrh;   # interrupt table entry 210
230  03a0  00000000  .word _user_intrh;   # interrupt table entry 211
231  03a4  00000000  .word _user_intrh;   # interrupt table entry 212
232  03a8  00000000  .word _user_intrh;   # interrupt table entry 213
233  03ac  00000000  .word _user_intrh;   # interrupt table entry 214
234  03b0  00000000  .word _user_intrh;   # interrupt table entry 215
235  03b4  00000000  .word _user_intrh;   # interrupt table entry 216
236  03b8  00000000  .word _user_intrh;   # interrupt table entry 217
237  03bc  00000000  .word _user_intrh;   # interrupt table entry 218
238  03c0  00000000  .word _user_intrh;   # interrupt table entry 219
239  03c4  00000000  .word _user_intrh;   # interrupt table entry 220
240  03c8  00000000  .word _user_intrh;   # interrupt table entry 221
241  03cc  00000000  .word _user_intrh;   # interrupt table entry 222
242  03d0  00000000  .word _user_intrh;   # interrupt table entry 223
243  03d4  00000000  .word _user_intrh;   # interrupt table entry 224
244  03d8  00000000  .word _user_intrh;   # interrupt table entry 225
245  03dc  00000000  .word _user_intrh;   # interrupt table entry 226
246  03e0  00000000  .word _user_intrh;   # interrupt table entry 227
247  03e4  00000000  .word _user_intrh;   # interrupt table entry 228
248  03e8  00000000  .word _user_intrh;   # interrupt table entry 229
249  03ec  00000000  .word _user_intrh;   # interrupt table entry 230
250  03f0  00000000  .word _user_intrh;   # interrupt table entry 231
251  03f4  00000000  .word _user_intrh;   # interrupt table entry 232
252  03f8  00000000  .word _user_intrh;   # interrupt table entry 233
253  03fc  00000000  .word _user_intrh;   # interrupt table entry 234
254  0400  00000000  .word _user_intrh;   # interrupt table entry 235
255  0404  00000000  .word _user_intrh;   # interrupt table entry 236
256  0408  00000000  .word _user_intrh;   # interrupt table entry 237
257  040c  00000000  .word _user_intrh;   # interrupt table entry 238
258  0410  00000000  .word _user_intrh;   # interrupt table entry 239
259  0414  00000000  .word _user_intrh;   # interrupt table entry 240
260  0418  00000000  .word _user_intrh;   # interrupt table entry 241
261  041c  00000000  .word _user_intrh;   # interrupt table entry 242
262  0420  00000000  .word _user_intrh;   # interrupt table entry 243
263  0424  00000000  .word _user_intrh;   # interrupt table entry 244
264  0428  00000000  .word _user_intrh;   # interrupt table entry 245
265  042c  00000000  .word _user_intrh;   # interrupt table entry 246
266  0430  00000000  .word _user_intrh;   # interrupt table entry 247
267  0434  00000000  .word _user_intrh;   # interrupt table entry 248
268  0438  00000000  .word _user_intrh;   # interrupt table entry 249
269  043c  00000000  .word _user_intrh;   # interrupt table entry 250
270  0440  00000000  .word _user_intrh;   # interrupt table entry 251
271  0444  00000000  .word _user_intrh;   # interrupt table entry 252
272  0448  00000000  .word _user_intrh;   # interrupt table entry 253
273  044c  00000000  .word _user_intrh;   # interrupt table entry 254
274  0450  00000000  .word _user_intrh;   # interrupt table entry 255




user_reserved() {}
user_machine() {}
user_trace() {}
user_operation() {}
user_arithmetic() {}
user_real_arithmetic() {}
user_constraint() {}
user_protection() {}
user_type() {}


user_intrh()
{
}


.text :
{
} >rom

.data :
{
} >ram

.bss :
{
} >ram


APPENDIX E 
CONSIDERATIONS FOR WRITING PORTABLE SOFTWARE


This appendix describes those parts of the 80960KB processor design
that are implementation dependent. This information is provided to
facilitate the design of programs and kernel code that will be portable
to other implementations of the 80960 architecture.


The following aspects of the 80960KB's operation are deviations from
the 80960 architecture:

1. On all bus write operations except those of the synmov, synmovl,
   and synmovq instructions, the processor ignores the BADAC pin (i.e.,
   errors signaled on "normal" writes are ignored).

2. The check for out-of-range input values for the expr, exprl, logepr,
   and logeprl instructions is omitted; out-of-range inputs yield an
   undefined result.

3. Bits 5 and 6 of a machine-level instruction word in the REG and MEMB
   formats and bits 0 and 1 of the CTRL format are provided to designate
   special function registers. The 80960KB processor has no special
   function registers.

4. The 80960KB processor does not guarantee that the value in register
   r2 of the current frame is predictable.

5. (The following is a note rather than a restriction.) When using the
   REG-format instructions, the m bit for every operand that is not
   defined by the instruction should be set (e.g., code the unused
   operand as an arbitrary literal). This practice may reduce overhead
   in some situations.


Stack frames in the 80960KB architecture are aligned on (SALIGN*16)
byte boundaries. SALIGN is an implementation defined parameter. For the
80960KB processor, SALIGN is 4. Stack frames for this processor are thus
aligned on 64 byte boundaries.

The low-order N bits of the FP are ignored and always interpreted to be
zero. The N parameter is defined by the following expression:
SALIGN*16 = 2^N. Thus for the 80960KB processor, N is 6.
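
The relationship between SALIGN, the frame boundary, and N can be
checked with a few lines of C. This is only a sketch: the helper names
are hypothetical, and SALIGN is fixed at the 80960KB value of 4 given
above. Portable code would treat it as a per-implementation constant.

```c
#include <stdint.h>

/* SALIGN is implementation defined; 4 is the 80960KB value from the text. */
#define SALIGN      4u
#define FRAME_ALIGN (SALIGN * 16u)   /* frame boundary in bytes (64 here) */

/* N such that SALIGN*16 == 2^N. Assumes FRAME_ALIGN is a power of two. */
static unsigned frame_align_bits(void)
{
    unsigned n = 0;
    uint32_t a = FRAME_ALIGN;
    while (a > 1u) {
        a >>= 1;
        n++;
    }
    return n;
}

/* The FP value as the processor interprets it: low-order N bits read as 0. */
static uint32_t effective_fp(uint32_t fp)
{
    return fp & ~(uint32_t)(FRAME_ALIGN - 1u);
}

/* Hypothetical helper: round an address up to the next frame boundary. */
static uint32_t next_frame_boundary(uint32_t addr)
{
    return (addr + FRAME_ALIGN - 1u) & ~(uint32_t)(FRAME_ALIGN - 1u);
}
```

Code that computes frame addresses through helpers of this kind, rather
than hard-coding 64, needs only the SALIGN definition changed when moved
to another implementation.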


The physical-address boundary on which an operand begins has an impact
on processor performance. For the 80960KB processor, the following is
true:

•  An operand that spans more word boundaries than necessary (e.g.,
   addressing a 32-bit operand on a non-word boundary) suffers a
   moderate cost in speed because of extra bus and memory cycles.

•  An operand that spans a 16-byte boundary suffers a large cost in
   speed.

•  String operands that begin on non-word boundaries suffer a moderate
   cost in speed. String operands that begin on word boundaries but not
   on 16-byte boundaries suffer a small cost in speed.
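
The first two rules above can be expressed as a small classifier, for
example to flag badly placed operands in a data-layout check. This is a
sketch only: the enum and function names are hypothetical, the string
operand rule is not modelled, and the text gives no cycle counts, so
the result is a category rather than a cost.

```c
#include <stdint.h>

typedef enum {
    ALIGN_OK,           /* no avoidable boundary crossings */
    ALIGN_EXTRA_WORDS,  /* spans more word boundaries than necessary */
    ALIGN_SPANS_16      /* spans a 16-byte boundary */
} align_cost;

/* Classify an operand of `size` bytes (size >= 1) starting at `addr`. */
static align_cost operand_cost(uint32_t addr, uint32_t size)
{
    uint32_t last = addr + size - 1u;                 /* last byte touched */
    uint32_t touched = (last >> 2) - (addr >> 2) + 1u; /* words touched */
    uint32_t minimum = (size + 3u) / 4u;              /* words if aligned */

    if ((addr >> 4) != (last >> 4))
        return ALIGN_SPANS_16;      /* the large penalty case */
    if (touched > minimum)
        return ALIGN_EXTRA_WORDS;   /* the moderate penalty case */
    return ALIGN_OK;
}
```

For instance, a 32-bit operand at an address ending in 2 touches two
words instead of one, and the same operand at an address ending in E
crosses into the next 16-byte block.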


The size of resumption records conditionally placed on the stack during
faults and interrupts is 16 bytes.

The upper 16M bytes of physical memory are reserved for special
functions of local-bus components and IACs.

The mechanism for sending, receiving, and handling IAC messages is not
defined in the 80960 architecture. It is a special implementation of
the 80960KB processor.

The interrupt IAC message, the interrupt pins, and the interrupt
register are not defined in the 80960 architecture. They are special
implementations for the 80960KB processor.

The 80960 architecture does not define an initialization mechanism. The
initialization mechanism and procedures described in this manual are
implementation dependent for the 80960KB processor.

The synmov, synmovl, synmovq, and synld instructions are not defined in
the 80960 architecture and are implementation dependent in the 80960KB
processor.

The LOCK pin is not defined in the 80960 architecture and is
implementation dependent in the 80960KB processor.


80960KB 
EMBEDDED 32-BIT MICROPROCESSOR 
WITH INTEGRATED FLOATING-POINT UNIT


•  High-Performance Embedded Architecture
   - 20 MIPS Burst Execution at 20 MHz
   - 7.5 MIPS* Sustained Execution at 20 MHz

•  On-Chip Floating-Point Unit
   - Supports IEEE 754 Floating-Point Standard
   - Four 80-Bit Registers
   - 4 Million Whetstones/Second at 20 MHz

•  512-Byte On-Chip Instruction Cache
   - Direct Mapped
   - Parallel Load/Decode for Uncached Instructions

•  Multiple Register Sets
   - Sixteen Global 32-Bit Registers
   - Sixteen Local 32-Bit Registers
   - Four Local Register Sets Stored On-Chip
   - Register Scoreboarding

•  Built-In Interrupt Controller
   - 32 Priority Levels
   - 256 Vectors
   - Supports 8259A

•  Easy to Use, High Bandwidth 32-Bit Bus
   - 53.3 MBytes/s Burst
   - Up to 16-Bytes Transferred per Burst

•  4 Gigabyte, Linear Address Space

•  132-Lead Pin Grid Array (PGA) Package


The 80960KB is the first member of Intel's new 32-bit microprocessor
family, the 960 series, which is designed especially for embedded
applications. It is based on the family's high performance, common core
architecture, and includes a 512-byte instruction cache, a built-in
interrupt controller, and an integrated floating-point unit.

The 80960KB has a large register set, multiple parallel execution
units, and a high-bandwidth, burst bus. Using advanced RISC technology,
this high performance processor is capable of execution rates in excess
of 7.5 million instructions per second.* The 80960KB is well-suited for
a wide range of embedded applications, including image processing,
industrial control, robotics, and telecommunications.


[Figure: block diagram showing bus control logic, interrupt controller,
and 32-bit burst bus]


The 80960KB is the first member of a new family of 32-bit
microprocessors from Intel known as the 960 Series. This series was
especially designed to serve the needs of embedded applications. The
embedded market includes applications as diverse as industrial
automation, avionics, image processing, graphics, robotics,
telecommunications, and automobiles. These types of applications
require high integration, low power consumption, quick interrupt
response times, and high performance. Since time to market is critical,
embedded microprocessors need to be easy to use in both hardware and
software designs.


All members of the 80960 series share a common core architecture which
utilizes RISC technology so that, except for special functions, the
family members are object code compatible. Each new processor in the
series will add its own special set of functions to the core to satisfy
the needs of a specific application or range of applications in the
embedded market. For example, future processors may include a DMA
controller, a timer, or an A/D converter.


The 80960KB includes an integrated floating-point unit. Also available
is the 80960MC, a military-grade version of the processor. The 80960KA,
another commercial version without floating-point support, will be
available in the near future.


g0 through g15      Sixteen global 32-bit registers (1)
fp0 through fp3     Four 80-bit floating-point registers
r0 through r15      Sixteen local 32-bit registers (2)
                    Arithmetic controls (32 bits)
                    Instruction pointer (32 bits)
                    Process controls (32 bits)
0 through 2^32-1    Address space

NOTES:
1. Register g15 is reserved for stack management functions.
2. Registers r0, r1, and r2 are reserved for stack management functions.


The 80960KB's architecture is based on the most recent advances in RISC
technology and is grounded in Intel's long experience in designing
embedded controllers. Many features contribute to the 80960KB's
exceptional performance:


1. Large Register Set. Having a large number of registers reduces the
number of times that a processor needs to access memory. Modern
compilers can take advantage of this feature to optimize execution
speed. For maximum flexibility, the 80960KB provides 32 32-bit
registers and four 80-bit floating-point registers. (See Figure 2.)


2. Fast Instruction Execution. Simple functions make up the bulk of
instructions in most programs, so that execution speed can be greatly
improved by ensuring that these core instructions execute in as short a
time as possible. The most-frequently executed instructions such as
register-register moves, add/subtract, logical operations, and shifts
execute in one to two cycles. (Table 1 contains a list of
instructions.)


3. Load/Store Architecture. Like other processors based on RISC
technology, the 80960KB has a Load/Store architecture: only the LOAD
and STORE instructions reference memory; all other instructions operate
on registers. This type of architecture simplifies instruction decoding
and is used in combination with other techniques to increase
parallelism.




Data Movement    Arithmetic          Logical          Bit and Bit Field

Load             Add                 And              Set Bit
Store            Subtract            Not And          Clear Bit
Move             Multiply            And Not          Not Bit
Load Address     Divide              Or               Check Bit
                 Remainder           Exclusive Or     Alter Bit
                 Modulo              Not Or           Scan for Bit
                 Shift               Or Not           Scan over Bit
                 Extended Multiply   Nor              Extract
                 Extended Divide     Exclusive Nor    Modify
                                     Not
                                     Nand
                                     Rotate

Comparison               Branch                 Call/Return       Fault

Compare                  Unconditional Branch   Call              Conditional Fault
Conditional Compare      Conditional Branch     Call Extended     Synchronize Faults
Compare and Increment    Compare and Branch     Call System
Compare and Decrement                           Return
                                                Branch and Link

Debug                    Miscellaneous                 Decimal

Modify Trace Controls    Atomic Add                    Move
Mark                     Atomic Modify                 Add with Carry
Force Mark               Flush Local Registers         Subtract with Carry
                         Modify Arithmetic Controls
                         Scan Byte for Equal
                         Test Condition Code

Conversion                 Floating-Point        Synchronous

Convert Real to Integer    Move Real             Synchronous Load
Convert Integer to Real    Add                   Synchronous Move
                           Subtract
                           Multiply
                           Divide
                           Remainder
                           Scale
                           Round
                           Square Root
                           Sine
                           Cosine
                           Tangent
                           Arctangent
                           Log
                           Log Binary
                           Log Natural
                           Exponent
                           Classify
                           Copy Real Extended
                           Compare


the 80960KB are 32 bits long and must be aligned on word boundaries.
This alignment makes it possible to eliminate the instruction-alignment
stage in the pipeline. To simplify the instruction decoder further,
there are only five instruction formats and each instruction uses only
one format. (See Figure 3.)


5. Overlapped Instruction Execution. A load operation allows execution
of subsequent instructions to continue before the data has been
returned from memory, so that these instructions can overlap the load.
The 80960KB manages this process transparently to software through the
use of a register scoreboard. Conditional instructions also make use of
a scoreboard so that subsequent unrelated instructions can be executed
while the conditional instruction is pending.


6. Integer Execution Optimization. When the result of an operation is
used as an operand in a subsequent calculation, the value is sent
immediately to its destination register. Yet at the same time, the
value is put back on a bypass path to the ALU, thereby saving the time
that otherwise would be required to retrieve the value for the next
operation.


7. Bandwidth Optimizations. The 80960KB gets optimal use of its memory
bus bandwidth because the bus is tuned for use with the cache: the line
size of the instruction cache matches the maximum burst size for
instruction fetches. The 80960KB automatically fetches four words in a
burst and stores them directly in the cache. Due to the size of the
cache and the fact that it is continually filled in anticipation of
needed instructions in the program flow, the 80960KB is exceptionally
insensitive to memory wait states. In fact, each wait state causes only
a 7% degradation in system performance. The benefit is that the 80960KB
will deliver outstanding performance even with a low cost memory
system.


8. Cache Bypass. If there is a cache miss, the processor fetches the
needed instruction, then sends it on to the instruction decoder at the
same time it updates the cache. Thus, no extra time is taken to load
and read the cache.


Memory Space and Addressing Modes


The 80960KB offers a linear programming environment so that all
programs running on the processor are contained in a single address
space. The maximum size of the address space is 4 Gigabytes (2^32
bytes).


For ease of use, the 80960KB has a small number of
addressing modes, but includes all those necessary 


level languages such as C, Fortran and Ada. Table 2 
lists the memory addressing modes. 


Data Types 


The 80960KB recognizes the following data types:


Numeric:
• 8-, 16-, 32- and 64-bit ordinals
• 8-, 16-, 32- and 64-bit integers
• 32-, 64- and 80-bit real numbers


Non-Numeric:
• Bit
• Bit Field
• Triple-Word (96 bits)
• Quad-Word (128 bits)


Large Register Set 


The programming environment of the 80960KB in- 
cludes a large number of registers. In fact, 36 regis- 
ters are available at any time. The availability of this 
many registers greatly reduces the number of mem- 
ory accesses required to execute most programs, 
which leads to greater instruction processing speed. 


There are two types of general-purpose registers: local and global. The
20 global registers consist of sixteen 32-bit registers (G0 through
G15) and four 80-bit registers (FP0 through FP3). These registers
perform the same function as the general-purpose registers provided in
other popular microprocessors. The term global refers to the fact that
these registers retain their contents across procedure calls.


The local registers, on the other hand, are procedure specific. For
each procedure call, the 80960KB allocates 16 local registers (R0
through R15). Each local register is 32 bits wide. Any register can
also be used for single or double-precision floating-point operations;
the 80-bit floating-point registers are provided for extended
precision.


To further increase the efficiency of the register set, multiple sets
of local registers are stored on-chip. This cache holds up to four
local register frames, which means that up to three procedure calls can
be made without having to access the procedure stack resident in
memory.


Although programs may have procedure calls nested many calls deep, a
program typically oscillates back and forth between only two or three
levels. As a result, with four stack frames in the cache, the
probability of there being a free frame in the cache when a call is
made is very high. In fact, runs of representative C-language programs
show that 80% of the calls are handled without needing to access
memory.


Table 2. Memory Addressing Modes

•  12-Bit Offset
•  32-Bit Offset
•  Register-Indirect
•  Register + 12-Bit Offset
•  Register + 32-Bit Offset
•  Register + (Index-Register x Scale-Factor)
•  Register x Scale Factor + 32-Bit Displacement
•  Register + (Index-Register x Scale-Factor) + 32-Bit Displacement
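
The memory addressing modes of Table 2 can all be read as special cases
of the most general one, register + (index-register x scale-factor) +
displacement. The C sketch below illustrates that reading. The function
name is hypothetical, the set of legal scale factors is not stated in
this text and is therefore left to the caller, and the 32-bit
wraparound is an assumption of the sketch rather than a documented
property.

```c
#include <stdint.h>

/*
 * Effective address for the general form in Table 2:
 *     base + (index * scale) + disp
 * Simpler modes are obtained by passing 0 for the unused terms, e.g.
 * register-indirect is (base, 0, 0, 0) and a plain offset is
 * (0, 0, 0, disp). Unsigned arithmetic wraps modulo 2^32 in this model.
 */
static uint32_t effective_address(uint32_t base, uint32_t index,
                                  uint32_t scale, uint32_t disp)
{
    return base + index * scale + disp;
}
```

An array element access such as a[i] with 4-byte elements maps onto the
register + (index-register x scale-factor) form with a scale of 4.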


If there are four or more active procedures and a new procedure is
called, the processor moves the oldest set of local registers in the
register cache to a procedure stack in memory to make room for a new
set of registers. Global register G15 is used by the processor as the
frame pointer (FP) for the procedure stack.

Note that the global and floating-point registers are not exchanged on
a procedure call, but retain their contents, making them available to
all procedures for fast parameter passing. An illustration of the
register cache is shown in Figure 4.
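
The behaviour of the local register cache can be modelled in a few
lines of C to see why shallow call patterns rarely touch memory. This
is a toy model under stated assumptions: the type and function names
are hypothetical, a frame is reloaded from memory only when a return
finds no resident caller frame (the text does not describe the return
path), and no timing is modelled.

```c
#define CACHE_FRAMES 4   /* local register sets held on-chip (from the text) */

typedef struct {
    int depth;     /* active procedures, starting at 1 */
    int resident;  /* frames currently in the on-chip cache */
    int spills;    /* frames written out to the procedure stack */
    int fills;     /* frames read back from the procedure stack */
} reg_cache;

static void rc_init(reg_cache *c)
{
    c->depth = 1;
    c->resident = 1;
    c->spills = 0;
    c->fills = 0;
}

/* Procedure call: allocate a new frame, spilling the oldest if full. */
static void rc_call(reg_cache *c)
{
    if (c->resident == CACHE_FRAMES) {
        c->spills++;        /* oldest set of local registers goes to memory */
        c->resident--;
    }
    c->resident++;
    c->depth++;
}

/* Procedure return: release the frame; reload the caller's if needed. */
static void rc_return(reg_cache *c)
{
    if (c->depth <= 1)
        return;             /* nothing to return to in this model */
    c->depth--;
    c->resident--;
    if (c->resident == 0) { /* assumed: caller frame was spilled earlier */
        c->fills++;
        c->resident = 1;
    }
}
```

In this model three nested calls from the initial procedure cause no
spills, the fourth causes one, and a program that oscillates between
two or three levels never spills at all.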


[Figure 4. Register cache: one of four local register sets]
To further reduce memory accesses, the 80960KB
includes a 512-byte on-chip instruction cache. The 
instruction cache is based on the concept of locality 
of reference; that is, most programs are not usually 
executed in a steady stream but consist of many 
branches and loops that lead to jumping back and 
forth within the same small section of code. Thus, by 
maintaining a block of instructions in a cache, the 
number of memory references required to read in- 
structions into the processor can be greatly reduced. 


To load the instruction cache, instructions are fetched in 16-byte
blocks, so that up to four instructions can be fetched at one time. An
efficient prefetch algorithm increases the probability that an
instruction will already be in the cache when it is needed.


Code for small loops will often fit entirely within the 
cache, leading to a great increase in processing 
speed since further memory references might not be 
necessary until the program exits the loop. Similarly, 
when calling short procedures, the code for the call- 
ing procedure is likely to remain in the cache, so it 
will be there on the procedure's return. 


Register Scoreboarding 


The instruction decoder has been optimized in sev- 
eral ways. One of these optimizations is the ability to 
do instruction overlapping by means of register 
scoreboarding. 


Register scoreboarding occurs when a LOAD instruction is executed to
move a variable from memory into a register. When the instruction is
initiated, a scoreboard bit on the target register is set. When the
register is actually loaded, the bit is reset. In between, any
reference to the register contents is accompanied by a test of the
scoreboard bit to ensure that the load has completed before processing
continues. Since the processor does not have to wait for the LOAD to be
completed, it can go on to execute additional instructions placed in
between the LOAD instruction and the instruction that uses the register
contents, as shown in the following example:


LOAD R4, address 1 
LOAD R5, address 2 
Unrelated instruction 
Unrelated instruction 
ADD R4, R5, R6 


In essence, the two unrelated instructions between the LOAD and ADD
instructions are executed for free (i.e., take no apparent time to
execute) because they are executed while the register is being loaded.
Up to three LOAD instructions can be pending at one time with three
corresponding scoreboard bits set. By exploiting this feature, system
programmers and compilers have a useful tool for optimizing execution
speed.
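
The effect described above can be reproduced with a small cycle-counting
model. The C sketch below is illustrative only: the names are
hypothetical, every instruction is assumed to issue in one cycle, and
the load latency is a free parameter, since the text gives no memory
timing. The limit of three pending loads is taken from the text.

```c
#define NREGS       16  /* registers tracked in this toy model */
#define MAX_PENDING 3   /* up to three LOADs may be pending (from the text) */

typedef struct {
    unsigned now;              /* current cycle count */
    unsigned ready_at[NREGS];  /* cycle at which a pending load completes;
                                  the scoreboard bit is "set" while
                                  ready_at[r] > now */
} scoreboard;

static void sb_init(scoreboard *s)
{
    int i;
    s->now = 0;
    for (i = 0; i < NREGS; i++)
        s->ready_at[i] = 0;
}

/* Number of registers whose scoreboard bit is currently set. */
static int sb_pending(const scoreboard *s)
{
    int i, n = 0;
    for (i = 0; i < NREGS; i++)
        if (s->ready_at[i] > s->now)
            n++;
    return n;
}

/* Issue a LOAD into register dst; data arrives `latency` cycles later. */
static void sb_load(scoreboard *s, int dst, unsigned latency)
{
    while (sb_pending(s) >= MAX_PENDING)
        s->now++;                         /* wait for a free scoreboard slot */
    s->now++;                             /* one cycle to issue the load */
    s->ready_at[dst] = s->now + latency;  /* set the scoreboard bit */
}

/* One-cycle register operation that reads src1 and src2. */
static void sb_op(scoreboard *s, int src1, int src2)
{
    if (s->ready_at[src1] > s->now)
        s->now = s->ready_at[src1];       /* stall until the load lands */
    if (s->ready_at[src2] > s->now)
        s->now = s->ready_at[src2];
    s->now++;
}
```

Running the five-instruction sequence from the example with an assumed
four-cycle load latency, and then the same sequence without the two
unrelated instructions, gives the same final cycle count in this model,
which is the sense in which the unrelated instructions are free.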


In the 80960KB, floating-point arithmetic has been made an integral
part of the architecture. Having the floating-point unit integrated
on-chip provides two advantages. First, it improves the performance of
the chip for floating-point applications, since no additional bus
overhead is associated with floating-point calculations, thereby
leaving more time for other bus operations such as I/O. Second, the
cost of using floating-point operations is reduced because a separate
coprocessor chip is not required.


The 80960KB floating-point (real number) data types 
include single-precision (32-bit), double-precision 
(64-bit), and extended precision (80-bit) floating- 
point numbers. Any register may be used to execute 
floating-point operations. 


The processor provides hardware support for both mandatory and
recommended portions of IEEE Standard 754 for floating-point
arithmetic, including all arithmetic, exponential, logarithmic, and
other transcendental functions. Table 3 shows execution times for some
representative instructions.


Table 3. Sample Floating-Point Execution Times (µs) at 20 MHz

                 32-Bit    64-Bit

Add                0.5       0.7
Subtract           0.5       0.7
Multiply           1.0       1.8
Divide             1.8       3.8
Square Root        5.0       5.2
Arctangent        13.4      17.5
Exponent          15.0      16.7
Sine              20.3      22.1
Cosine            20.3      22.1



High Bandwidth Local Bus


An 80960KB CPU resides on a high-bandwidth ad- 
dress/data bus known as the local bus (L-Bus). The 
L-Bus provides a direct communication path be- 
tween the processor and the memory and I/O sub- 
system interfaces. The processor uses the local bus 
to fetch instructions, manipulate memory, and re- 
spond to interrupts. Its features include: 


• 32-bit multiplexed address/data path 
• Four-word burst capability, which allows transfers 
from 1 to 16 bytes at a time 
• High bandwidth reads and writes at 53 MBytes 
per second 
• Special signal to indicate whether a memory 
transaction can be cached 


Figure 5 identifies the groups of signals which con- 
stitute the L-Bus. Table 4 lists the function of the L- 
Bus and other processor-support signals, such as 
the interrupt lines. 


Interrupt Handling


The 80960KB can be interrupted in one of two ways: 
by the activation of one of four interrupt pins or by 
sending a message on the processor's data bus. 


The 80960KB is unusual in that it automatically han- 
dles interrupts on a priority basis and tracks pending 
interrupts through its on-chip interrupt controller. 
Two of the interrupt pins can be configured to pro- 
vide 8259A handshaking for expansion beyond four 
interrupt lines. 


Debug Features 


The 80960KB has built-in debug capabilities. There 
are two types of breakpoints and six different trace 
modes. The debug features are controlled by two 
internal 32-bit registers, the Process-Controls Word 
and the Trace-Controls Word. By setting bits in 
these control words, a software debug monitor can 
closely control how the processor responds during 
program execution. 


The 80960KB has both hardware and software 
breakpoints. It provides two hardware breakpoint 
registers on-chip which can be set by a special com- 
mand to any value. When the instruction pointer 
matches the value in one of the breakpoint registers, 
the breakpoint will fire, and a breakpoint handling 
routine is called automatically. 


The 80960KB also provides software breakpoints 
through the use of two instructions, MARK and 
FMARK. These instructions can be placed at any 
point in a program and will cause the processor to 
halt execution at that point and call the breakpoint 
handling routine. The breakpoint mechanism is easy 
to use and provides a powerful debugging tool. 


Tracing is available for instructions (single-step execution), calls
and returns, and branching. Each different type of trace may be enabled
separately by a special debug instruction. In each case, the 80960KB
executes the instruction first and then calls a trace handling routine
(usually part of a software debug monitor). Further program execution
is halted until the trace routine is completed. When the trace event
handling routine is completed, instruction execution resumes at the
next instruction. The


[Figure 5. Local bus signal groups: control (address, data, and
operation signals, 15 lines); arbitration (2 lines)]


implemented completely in hardware, greatly simplify the task of
testing and debugging software.


The 80960KB has an automatic mechanism to handle faults. There are ten
fault types including trace, arithmetic, and floating-point faults.
When the processor detects a fault, it automatically calls the
appropriate fault handling routine and saves the current instruction
pointer and necessary state information to make efficient recovery
possible. The processor posts diagnostic information on the type of
fault to a Fault Record. Like interrupt handling routines, fault
handling routines are usually written to meet the needs of a specific
application and are often included as part of the operating system or
kernel.


For each of the ten fault types, there are numerous subtypes that
provide specific information about a fault. For example, a
floating-point fault may have its subtype set to an Overflow or
Zero-Divide fault. The fault handler can use this specific
information to respond correctly to the fault.


Upon reset, the 80960KB automatically conducts an extensive
internal test of its major blocks of logic. It then performs a zero
check sum on the first eight words in memory to ensure that the
system has been loaded correctly. If a problem is discovered at any
point during the self-test, the 80960KB will assert its FAILURE pin
and will not begin program execution. The self-test takes
approximately 47,000 cycles to complete.


System manufacturers can use the 80960KB's self-test feature during
incoming parts inspection. No special diagnostic programs need to
be written, and the test is both thorough and fast. The self-test
capability helps ensure that defective parts will be discovered
before systems are shipped, and once in the field, the self-test
makes it easier to distinguish between problems caused by processor
failure and problems resulting from other causes.


The 80960KB is fabricated using Intel's CHMOS III (Complementary
High Speed Metal Oxide Semiconductor) process. This advanced
technology eliminates the frequency and reliability limitations of
older CMOS processes and opens a new era in microprocessor
performance. It combines the high performance capabilities of
Intel's industry-leading HMOS III technology with the high density
and low power characteristics of CMOS. The 80960KB is available at
16 MHz and 20 MHz. A 25 MHz version will be available in the near
future.


Symbol 
Type 
Name and Function 


CLK2 (Type: I)
SYSTEM CLOCK provides the fundamental timing for 80960KB systems.
It is divided by two inside the 80960KB to generate the internal
processor clock.

LAD31-LAD0 (Type: I/O, T.S.)
LOCAL ADDRESS/DATA BUS carries 32-bit physical addresses and data
to and from memory. During an address (Ta) cycle, bits 2-31 contain
a physical word address (bits 0-1 indicate SIZE; see below). During
a data (Td) cycle, bits 0-31 contain read or write data. The LAD
lines are active HIGH and float to a high impedance state when not
active.

SIZE, which is comprised of bits 0-1 of the LAD lines during a Ta
cycle, specifies the size of a burst transfer in words.

    LAD1   LAD0   Burst Size
    0      0      1 Word
    0      1      2 Words
    1      0      3 Words
    1      1      4 Words

ALE (Type: O, T.S.)
ADDRESS-LATCH ENABLE indicates the transfer of a physical address.
ALE is asserted during a Ta cycle and deasserted before the
beginning of the Td state. It is active LOW and floats to a high
impedance state when the processor is idle or is at the end of any
bus access.
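
The address-cycle encoding described for the LAD lines can be sketched in a few lines of Python. This is only an illustration of the bit layout given above (bits 2-31 carry the word address, bits 0-1 carry SIZE); the function and constant names are hypothetical and are not part of any Intel tool or library.

```python
# Burst sizes encoded in LAD1:LAD0 during a Ta cycle (from the SIZE table).
BURST_WORDS = {0b00: 1, 0b01: 2, 0b10: 3, 0b11: 4}


def decode_address_cycle(lad: int) -> tuple[int, int]:
    """Split a 32-bit Ta-cycle LAD value into (word address, burst size in words).

    The returned address is the byte address of the word, i.e. the LAD value
    with the two SIZE bits cleared.
    """
    if not 0 <= lad <= 0xFFFFFFFF:
        raise ValueError("LAD value must fit in 32 bits")
    word_address = lad & 0xFFFFFFFC   # bits 2-31
    size = BURST_WORDS[lad & 0b11]    # bits 0-1
    return word_address, size


if __name__ == "__main__":
    address, words = decode_address_cycle(0x00001003)
    print(f"address=0x{address:08X}, burst of {words} word(s)")
```

Keeping the SIZE lookup in a table rather than computing `bits + 1` makes the correspondence with the data-sheet table easy to check by eye.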




ADS (Type: O, O.D.)
ADDRESS/DATA STATUS indicates an address state. ADS is asserted
every Ta state and deasserted during the following Td state. For a
burst transaction, ADS is asserted again every Td state where READY
was asserted in the previous cycle.

W/R (Type: O, O.D.)
WRITE/READ specifies, during a Ta cycle, whether the operation is a
write or read. It is latched on-chip and remains valid during Td
cycles.

DT/R (Type: O, O.D.)
DATA TRANSMIT/RECEIVE indicates the direction of data transfer to
and from the L-Bus. It is low during Ta and Td cycles for a read or
interrupt acknowledgement; it is high during Ta and Td cycles for a
write. DT/R never changes state when DEN is asserted (see Timing
Diagrams).

DEN (Type: O, O.D.)
DATA ENABLE is asserted during Td cycles and indicates transfer of
data on the LAD bus lines.

READY (Type: I)
READY indicates that data on LAD lines can be sampled or removed.
If READY is not asserted during a Td cycle, the Td cycle is
extended to the next cycle by inserting a wait state (Tw), and ADS
is not asserted in the next cycle.


LOCK (Type: I/O, O.D.)
BUS LOCK prevents other bus masters from gaining control of the
L-Bus following the current cycle (if they would assert LOCK to do
so). LOCK is used by the processor or any bus agent when it
performs indivisible Read/Modify/Write (RMW) operations.

For a read that is designated as a RMW-read, LOCK is examined. If
asserted, the processor waits until it is not asserted; if not
asserted, the processor asserts LOCK during the Ta cycle and leaves
it asserted.

A write that is designated as an RMW-write deasserts LOCK in the Ta
cycle. During the time LOCK is asserted, a bus agent can perform a
normal read or write but no RMW operations. LOCK is also held
asserted during an interrupt-acknowledge transaction.
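
The LOCK rules above amount to a small protocol: an RMW-read may only start when LOCK is not asserted, it leaves LOCK asserted, and the matching RMW-write releases it. The toy model below is a sketch of that behaviour only; the class and method names are hypothetical, and it ignores bus timing and the interrupt-acknowledge case.

```python
class LBusLock:
    """Toy model of the LOCK line shared by L-Bus agents (illustrative only)."""

    def __init__(self) -> None:
        self.asserted = False

    def rmw_read(self) -> bool:
        """Attempt an RMW-read.

        Returns False if LOCK is already asserted (the agent must wait and
        retry); otherwise asserts LOCK, leaves it asserted, and returns True.
        """
        if self.asserted:
            return False
        self.asserted = True
        return True

    def rmw_write(self) -> None:
        """An RMW-write deasserts LOCK, ending the indivisible operation."""
        self.asserted = False

    def normal_access_allowed(self) -> bool:
        """Normal reads and writes may proceed whether or not LOCK is asserted."""
        return True


if __name__ == "__main__":
    lock = LBusLock()
    print(lock.rmw_read())   # first agent takes the lock
    print(lock.rmw_read())   # second agent has to wait
    lock.rmw_write()         # first agent finishes its RMW sequence
    print(lock.rmw_read())   # second agent can now proceed
```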


BE3-BE0 (Type: O, O.D.)
BYTE ENABLE LINES specify which data bytes (up to four) on the bus
take part in the current bus cycle. BE3 corresponds to LAD31-LAD24
and BE0 corresponds to LAD7-LAD0.

The byte enables are provided in advance of data. The byte enables
asserted during Ta specify the bytes of the first data word. The
byte enables asserted during Td specify the bytes of the next data
word (if any), that is, the word to be transmitted following the
next assertion of READY. The byte enables during the Td cycles
preceding the last assertion of READY are undefined. The byte
enables are latched on-chip and remain constant from one Td cycle
to the next when READY is not asserted.

For reads, the byte enables specify the byte(s) that the processor
will actually use.

L-Bus agents are required to assert only adjacent byte enables
(e.g., asserting just BE0 and BE2 is not permitted), and are
required to assert at least one byte enable.

Accesses must also be naturally aligned (e.g., asserting BE1 and
BE2 is not allowed even though they are adjacent). To produce
address bits A0 and A1 externally, they can be decoded from the
byte enables.
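
As a rough illustration of decoding A1 and A0 from the byte enables, the sketch below maps each permitted pattern to the byte offset of its lowest enabled byte. It works on logical "asserted" flags (bit i set means BEi is asserted) and does not model pin polarity. It also assumes that only 1-, 2-, and 4-byte accesses count as naturally aligned; the text above gives examples of forbidden patterns but not an exhaustive list, so that assumption and all names here are the example's own.

```python
# Bit i of the mask is 1 when BEi is asserted (logical sense only).
# Assumed permitted patterns: single bytes, aligned half-words, full word.
VALID_MASKS = {
    0b0001: 0, 0b0010: 1, 0b0100: 2, 0b1000: 3,   # one byte
    0b0011: 0, 0b1100: 2,                         # aligned half-word
    0b1111: 0,                                    # full word
}


def low_address_bits(be_mask: int) -> int:
    """Return the value of A1:A0 (0-3) implied by the asserted byte enables."""
    try:
        return VALID_MASKS[be_mask]
    except KeyError:
        raise ValueError(
            f"byte-enable pattern {be_mask:04b} is not permitted"
        ) from None


if __name__ == "__main__":
    for mask in (0b0001, 0b1100, 0b1111):
        print(f"{mask:04b} -> A1:A0 = {low_address_bits(mask):02b}")
```

Patterns such as BE0 with BE2 (not adjacent) or BE1 with BE2 (adjacent but misaligned) fall outside the table and raise `ValueError`.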




HOLD/HLDAR (Type: I)
HOLD: If the processor is the primary bus master (PBM), the input
is interpreted as HOLD, a request from a secondary bus master to
acquire the bus. When the processor receives HOLD and grants
another master control of the bus, it floats its tri-state bus
lines and then asserts HLDA and enters the Th state. When HOLD is
deasserted, the processor will deassert HLDA and go to either the
Ti or Ta state.

HOLD ACKNOWLEDGE RECEIVED: If the processor is a secondary bus
master (SBM), the input is HLDAR, which indicates, when HOLDR
output is high, that the processor has acquired the bus. Processors
and other agents can be told at reset if they are the primary bus
master (PBM).

HLDA/HOLDR (Type: O, T.S.)
HOLD ACKNOWLEDGE: If the processor is a primary bus master, the
output is HLDA, which relinquishes control of the bus to another
bus master.

HOLD REQUEST: For secondary bus masters (SBM), the output is HOLDR,
which is a request to acquire the bus. The bus is said to be
acquired if the agent is a primary bus master and does not have its
HLDA output asserted, or if the agent is a secondary bus master and
has its HOLD input and HLDA output asserted.

CACHE (Type: O, T.S.)
CACHE indicates if an access is cacheable during a Ta cycle. It is
not asserted during any synchronous access, such as a synchronous
load or move instruction used for sending an IAC message. The CACHE
signal floats to a high impedance state when the processor is idle.




BADAC (Type: I)
BAD ACCESS, if asserted in the cycle following the one in which the
last READY of a transaction is asserted, indicates that an
unrecoverable error has occurred on the current bus transaction, or
that a synchronous load/store instruction has not been
acknowledged.

STARTUP: During system reset, the BADAC signal is interpreted
differently. If the signal is high, it indicates that this
processor will perform system initialization. If it is low, another
processor in the system will perform system initialization instead.

RESET (Type: I)
RESET clears the internal logic of the processor and causes it to
re-initialize.

During RESET assertion, the input pins are ignored (except for
BADAC and IAC/INT0), the tri-state output pins are placed in a high
impedance state, and other output pins are placed in their
non-asserted state.

RESET must be asserted for at least 41 CLK2 cycles for a
predictable RESET. The HIGH to LOW transition of RESET should occur
after the rising edge of both CLK2 and the external bus CLK, and
before the next rising edge of CLK2.


FAILURE (Type: O, O.D.)
INITIALIZATION FAILURE indicates that the processor has failed to
initialize correctly. After RESET is deasserted and before the
first bus transaction begins, FAILURE is asserted while the
processor performs a self-test. If the self-test completes
successfully, then FAILURE is deasserted. Next, the processor
performs a zero checksum on the first eight words of memory. If it
fails, FAILURE is asserted for a second time and remains asserted;
if it passes, system initialization continues and FAILURE remains
deasserted.
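
The FAILURE sequence after reset can be summarised as a short trace of phases. The sketch below is a hypothetical helper (the function name and phase labels are the example's own) that lists, for each phase, whether FAILURE is asserted. For the self-test failure case it follows the earlier statement that the processor asserts FAILURE and does not begin program execution.

```python
def failure_pin_trace(self_test_ok: bool, checksum_ok: bool) -> list[tuple[str, bool]]:
    """Return (phase, FAILURE asserted?) pairs following RESET deassertion."""
    trace = [("self-test running", True)]
    if not self_test_ok:
        # FAILURE stays asserted and no bus transaction is started.
        trace.append(("self-test failed", True))
        return trace
    trace.append(("self-test passed", False))
    if checksum_ok:
        trace.append(("checksum passed, initialization continues", False))
    else:
        # Second assertion; the pin then remains asserted.
        trace.append(("checksum failed", True))
    return trace


if __name__ == "__main__":
    for phase, asserted in failure_pin_trace(self_test_ok=True, checksum_ok=False):
        print(f"{phase:45s} FAILURE {'asserted' if asserted else 'deasserted'}")
```

A board-level monitor watching the pin can therefore tell the two failure causes apart: one continuous assertion points at the self-test, while an assertion, release, and second assertion points at the memory image.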


N.C. (Type: N/A)
NOT CONNECTED indicates pins should not be connected. Never connect
any pin marked N.C.




IAC/INT0 (Type: I)
INTERAGENT COMMUNICATION REQUEST/INTERRUPT 0 indicates either that
there is a pending IAC message for the processor or an interrupt.
The bus interrupt control register determines in which way the
signal should be interpreted.

To signal an interrupt or IAC request in a synchronous system, this
pin (as well as the other interrupt pins) must be enabled by being
deasserted for at least one bus cycle and then asserted for at
least one additional bus cycle; in an asynchronous system, the pin
must remain deasserted for at least two bus cycles and then be
asserted for at least two more bus cycles.

LOCAL PROCESSOR NUMBER: This signal is interpreted differently
during system reset. If the signal is at a high voltage level, it
indicates that this processor is a primary bus master (Local
Processor Number = 0); if it is at a low voltage level, it
indicates that this processor is a secondary bus master (Local
Processor Number = 1).
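
The signalling rule for interrupt and IAC requests (a run of deasserted bus cycles followed immediately by a run of asserted cycles, each at least one cycle long in a synchronous system or two in an asynchronous one) can be checked against a list of per-cycle samples. The helper below is a sketch for test benches; its name and the sampled-list representation are assumptions of this example.

```python
def is_valid_request(samples: list[bool], synchronous: bool) -> bool:
    """Check a per-bus-cycle sample list for a valid request.

    samples[i] is True when the pin is asserted in bus cycle i. A request is
    recognised when a deasserted run is immediately followed by an asserted
    run, each at least 1 cycle long (synchronous) or 2 cycles (asynchronous).
    """
    need = 1 if synchronous else 2
    low_run = 0
    i = 0
    while i < len(samples):
        if not samples[i]:
            low_run += 1
            i += 1
            continue
        # Measure the asserted run that starts at index i.
        j = i
        while j < len(samples) and samples[j]:
            j += 1
        if low_run >= need and (j - i) >= need:
            return True
        low_run = 0
        i = j
    return False


if __name__ == "__main__":
    print(is_valid_request([False, True], synchronous=True))
    print(is_valid_request([False, True], synchronous=False))
```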


INT1 (Type: I)
INTERRUPT 1, like INT0, provides direct interrupt signaling.

INT2/INTR (Type: I)
INTERRUPT 2/INTERRUPT REQUEST: The bus control register determines
how this pin is interpreted. If INT2, it has the same
interpretation as the INT0 and INT1 pins. If INTR, it is used to
receive an interrupt request from an external interrupt controller.

INT3/INTA (Type: I/O, O.D.)
INTERRUPT 3/INTERRUPT ACKNOWLEDGE: The bus interrupt control
register determines how this pin is interpreted. If INT3, it has
the same interpretation as the INT0, INT1, and INT2 pins. If INTA,
it is used as an output to control interrupt-acknowledge bus
transactions. The INTA output is latched on-chip and remains valid
during Td cycles; as an output, it is open-drain.


Power and Grounding 


The 80960KB is implemented in CHMOS III technology and has modest
power requirements. Its high clock frequency and numerous output
buffers (address/data, control, error, and arbitration signals) can
cause power surges as multiple output buffers drive new signal
levels simultaneously. For clean on-chip power distribution at high
frequency, 11 VCC and 13 VSS pins separately feed functional units
of the 80960KB.

Power and ground connections must be made to all power and ground
pins of the 80960KB. On the circuit board, all VCC pins must be
strapped closely together, preferably on a power plane. Likewise,
all VSS pins should be strapped together, preferably on a ground
plane. These pins may not be connected together within the chip.


Power Decoupling Recommendations


Liberal decoupling capacitance should be placed near the 80960KB.
The processor can cause transient power surges when driving the
L-Bus, particularly when it is connected to a large capacitive
load.

Low inductance capacitors and interconnects are recommended for
best high frequency electrical performance. Inductance can be
reduced by shortening the board traces between the processor and
decoupling capacitors as much as possible. Capacitors specifically
designed for PGA packages are also commercially available and offer
the lowest possible inductance.

For reliable operation, always connect unused inputs to an
appropriate signal level. In particular, if one or more interrupt
lines are not used, they should be pulled up. No inputs should ever
be left floating.


All open-drain outputs require a pullup device. While in some cases
a simple pullup resistor will be adequate, we recommend a network
of pullup and pulldown resistors biased to a valid VIH (~3.4V) and
terminated in the characteristic impedance of the circuit board.
Figure 6 shows our recommendations for the resistor values for both
a low and high current drive network, which assumes that the
circuit board has a characteristic impedance of 100Ω. The advantage
of terminating the output signals in this fashion is that it limits
signal swing and reduces AC power consumption.
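
The pullup/pulldown network described above is a plain resistive divider, so its idle voltage, its equivalent (Thevenin) resistance, and the current the open-drain driver must sink can be estimated with Ohm's law. The sketch below does that arithmetic. The resistor values in the usage lines are HYPOTHETICAL, picked by this example so that the parallel combination is close to 100Ω; they are not the values from Figure 6, which is not reproduced here.

```python
def terminator(vcc: float, r_up: float, r_down: float, v_ol: float) -> tuple[float, float, float]:
    """Return (idle voltage, Thevenin resistance, sink current at v_ol).

    Models a pull-up resistor to VCC and a pull-down resistor to ground on an
    open-drain output. Voltages in volts, resistances in ohms, current in amps.
    """
    v_idle = vcc * r_down / (r_up + r_down)        # divider voltage, driver off
    r_thevenin = r_up * r_down / (r_up + r_down)   # parallel combination
    i_sink = (vcc - v_ol) / r_up - v_ol / r_down   # driver current while holding v_ol
    return v_idle, r_thevenin, i_sink


if __name__ == "__main__":
    # Hypothetical values, NOT taken from Figure 6.
    v_idle, r_th, i_sink = terminator(vcc=5.0, r_up=147.0, r_down=313.0, v_ol=0.45)
    print(f"idle voltage  {v_idle:.2f} V")
    print(f"termination   {r_th:.1f} ohm")
    print(f"sink current  {i_sink * 1000:.1f} mA")
```

The sink current should be compared with the open-drain output current at which the output low voltage is specified in the D.C. characteristics.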


Figure 7 shows the typical supply current requirements over the
operating temperature range of the processor at supply voltage
(VCC) of 5V. Figure 8 shows the typical power supply current (ICC)
required by the 80960KB at various operating frequencies when
measured at three input voltage (VCC) levels.


For a given output current (IOL), the curve in Figure 9 shows the
worst case output low voltage (VOL).


80960KB 
OPEN-DRAIN 
OUTPUT 


Low Drive Network: 


• VOH = 3.42V 
• IOL = 25.3 mA 


Figure 10 shows the typical capacitive derating curve for the
80960KB measured from 1.5V on the system clock (CLK) to 0.8V on the
falling edge and 2.0V on the rising edge of the L-Bus address/data
(LAD) signals.


Figure 13 illustrates the load circuit used to test the 80960KB's
tri-state pins, and Figure 14 shows the load circuit used to test
the open-drain outputs. The open-drain test uses an active load
circuit in the form of a matched diode bridge. Since the open-drain
outputs only sink current, however, only the IOL legs of the bridge
are necessary and the IOH legs are not used. When the 80960KB
driver under test is turned off, the output pin is pulled up to
VREF (i.e., VOH). Diode D1 is turned off and the IOL current source
flows through diode D2.


When the 80960KB open-drain driver under test is on, diode D1 is
also on, and the voltage on the pin being tested drops to VOL.
Diode D2 turns off and IOL flows through diode D1.


80960KB
OPEN-DRAIN
OUTPUT


High Drive Network:
• VOH = 3.41V
• IOL = 33.8 mA

[Derating curve axis: Capacitive Load (pF)]


Operating Temperature ........ 0°C to +85°C Case
Storage Temperature .......... -65°C to +150°C
Voltage on Any Pin ........... -0.5V to VCC + 0.5V
Power Dissipation ............ 2.9W (20 MHz)


Stresses above those listed under "Absolute Maximum Ratings" may
cause permanent damage to the device. This is a stress rating only
and functional operation of the device at these or any other
conditions above those indicated in the operational sections of
this specification is not implied. Exposure to absolute maximum
rating conditions for extended periods may affect device
reliability.


NOTICE Specifications contained within the 
following tables are subject to change. 


D.C. CHARACTERISTICS

80960KB (16 MHz): TCASE = 0°C to +85°C, VCC = 5V ± 10%
80960KB (20 MHz): TCASE = 0°C to +85°C, VCC = 5V ± 5%

Symbol  Parameter                  Min       Max        Units  Test Conditions
VIL     Input Low Voltage          -0.3      +0.8       V
VIH     Input High Voltage         2.0       VCC + 0.3  V
VCL     CLK2 Input Low Voltage     -0.3      +1.0       V
VCH     CLK2 Input High Voltage    0.55 VCC  VCC + 0.3  V
VOL     Output Low Voltage                   0.45 (5)   V      (1)
                                             0.60 (6)
VOH     Output High Voltage        2.4                  V      (2, 4)
ICC     Power Supply Current:
          16 MHz                             475        mA     TA = 0°C
          20 MHz                             545        mA     TA = 0°C
ILI     Input Leakage Current                ±15        µA     0 ≤ VO ≤ VCC
ILO     Output Leakage Current               ±15        µA     0.45 ≤ VO ≤ VCC
CIN     Input Capacitance                    10         pF     fC = 1 MHz (3)
CO      I/O or Output Capacitance            12         pF     fC = 1 MHz (3)
CCLK    Clock Capacitance                    10         pF     fC = 1 MHz (3)

NOTES:
1. For tri-state outputs, this parameter is measured at:
     Address/Data ........ 4.0 mA
     Controls ............ 5.0 mA
2. This parameter is measured at:
     Address/Data ........ -1.0 mA
     Controls ............ -0.9 mA
     ALE ................. -5.0 mA
3. Input, output, and clock capacitance are not tested.
4. Not measured on open-drain outputs.
5. For open-drain outputs ........ 25 mA
6. For open-drain outputs ........ 40 mA



This section describes the AC specifications for the 80960KB pins.
All input and output timings are specified relative to the 1.5V
level of the rising edge of CLK2, and refer to the time at which
the signal reaches (for output delay and input setup) or leaves
(for hold time) the TTL levels of LOW (0.8V) or HIGH (2.0V). All AC
testing should be done with input voltages of 0.4V and 2.4V, except
for the clock (CLK2), which should be tested with input voltages of
0.45V and 0.55 VCC.


[Input timing figures. INPUTS(1): LAD31-LAD0, BADAC, IAC/INT0,
INT1, INT2/INTR, INT3. INPUTS(2): HOLD, HLDAR, LOCK, READY. Also
labelled: HOLD AFTER ALE INACTIVE.]

A.C. Specification Tables


80960KB A.C. Characteristics (16 MHz)

Symbol  Parameter                         Min    Max  Units  Test Conditions
T1      Processor Clock Period (CLK2)     31.25  125  ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)   11          ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time (CLK2)  11          ns     VIL = 90% Point = 0.1V + 0.5 VCC
T4      Processor Clock Fall Time (CLK2)         10   ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)         10   ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                5      35   ns     CL = 100 pF (LAD), CL = 75 pF (Controls)
T7      ALE Width                         15          ns     CL = 75 pF
T8      ALE Output Valid Delay            5      20   ns     CL = 75 pF (2)
T9      Output Float Delay                5      20   ns     CL = 100 pF (LAD), CL = 75 pF (Controls) (2)
T10     Input Setup 1                     3           ns
T11     Input Hold                        10          ns
T12     Input Setup 2                     8           ns
T13     Setup to ALE Inactive             10          ns     CL = 100 pF (LAD), CL = 75 pF (Controls)
T14     Hold after ALE Inactive           8           ns     CL = 100 pF (LAD), CL = 75 pF (Controls)
T15     Reset Hold                        5           ns
T16     Reset Setup                       8           ns
T17     Reset Width                       1281        ns     41 CLK2 Periods Minimum

NOTES:
1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.
2. A float condition occurs when the maximum output current becomes
   less than ILO. Float delay is not tested, but should be no
   longer than the valid delay.


80960KB A.C. Characteristics (20 MHz)

Symbol  Parameter                         Min    Max  Units  Test Conditions
T1      Processor Clock Period (CLK2)     25     125  ns     VIN = 1.5V
T2      Processor Clock Low Time (CLK2)   8           ns     VIL = 10% Point = 1.2V
T3      Processor Clock High Time (CLK2)  8           ns     VIL = 90% Point = 0.1V + 0.5 VCC
T4      Processor Clock Fall Time (CLK2)         10   ns     VIN = 90% Point to 10% Point
T5      Processor Clock Rise Time (CLK2)         10   ns     VIN = 10% Point to 90% Point
T6      Output Valid Delay                5      30   ns     CL = 60 pF (LAD), CL = 50 pF (Controls)
T7      ALE Width                         12          ns     CL = 50 pF
T8      ALE Output Valid Delay            5      20   ns     CL = 50 pF (2)
T9      Output Float Delay                5      20   ns     CL = 60 pF (LAD), CL = 50 pF (Controls) (2)
T10     Input Setup 1                     3           ns
T11     Input Hold                        10          ns
T12     Input Setup 2                     7           ns
T13     Setup to ALE Inactive             10          ns     CL = 60 pF (LAD), CL = 50 pF (Controls)
T14     Hold after ALE Inactive           8           ns     CL = 60 pF (LAD), CL = 50 pF (Controls)
T15     Reset Hold                        5           ns
T16     Reset Setup                       7           ns
T17     Reset Width                       1025        ns     41 CLK2 Periods Minimum

NOTES:
1. IAC/INT0, INT1, INT2/INTR, INT3 can be asynchronous.
2. A float condition occurs when the maximum output current becomes
   less than ILO. Float delay is not tested, but should be no
   longer than the valid delay.
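
Two of the table entries follow directly from the CLK2 period: the minimum reset width is 41 CLK2 periods, and the internal processor clock runs at half the CLK2 frequency. The short sketch below reproduces that arithmetic; the function names are hypothetical and chosen only for this illustration.

```python
RESET_MIN_CLK2_PERIODS = 41  # "41 CLK2 Periods Minimum" in the T17 rows


def reset_width_ns(clk2_period_ns: float) -> float:
    """Minimum RESET assertion time for a given CLK2 period."""
    return RESET_MIN_CLK2_PERIODS * clk2_period_ns


def processor_clock_mhz(clk2_period_ns: float) -> float:
    """Internal clock frequency; CLK2 is divided by two inside the 80960KB."""
    clk2_mhz = 1000.0 / clk2_period_ns
    return clk2_mhz / 2.0


if __name__ == "__main__":
    for period in (31.25, 25.0):   # minimum T1 values from the two tables
        print(f"T1 = {period} ns: processor clock {processor_clock_mhz(period)} MHz, "
              f"reset width >= {reset_width_ns(period)} ns")
```

With the minimum T1 values this gives 1281.25 ns and 1025 ns, which agrees with the T17 entries (the 16 MHz table lists the value as 1281 ns).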


[Open-drain test load circuit: IOL tested at 25 and 40 mA;
VREF = VCC; D1 and D2 are matched. (270565-32)]

Figure 13. Test Load Circuit for TRI-STATE Output Pins

[Reset timing diagram. Signals shown: CLK2, CLK, RESET, OUTPUTS.
Init parameters (BADAC, IAC/INT0) must be set up 8 clocks prior to
the marked CLK2 edge and must be held beyond it.
T15 = RESET HOLD, T16 = RESET SETUP, T17 = RESET WIDTH]


[Hold timing diagram for primary and secondary bus masters. A delay
of 5 ns minimum is required.]

Figure 17. Hold Timing


Input hold times can be disregarded by the designer whenever the
input is removed because a subsequent output from the processor is
deasserted (e.g., DEN becomes deasserted).


In other words, whenever the processor generates an output that
indicates a transition into a subsequent state, the processor must
have sampled any inputs for the previous state. As an example, in
the cycle following a read, the minimum delay before DEN becomes
deasserted is 5 ns, but the minimum hold time on the data is 10 ns.
When DEN is deasserted, however, the data is guaranteed to have
been sampled.


Similarly, whenever the processor generates an output that
indicates a transition into a subsequent state, any outputs that
are specified to be tri-stated in this new state are guaranteed to
be tri-stated. For example, in the Td cycle following a Ta cycle
for a read, the minimum output delay of DEN is 5 ns, but the
maximum float time of LAD is 20 ns. When DEN is asserted, however,
the LAD outputs are guaranteed to have been tri-stated.


Designing for the ICE-960KB


The 80960KB In-Circuit Emulator assists in debugging 80960KB
hardware and software designs. The product consists of a probe
module, cable, and control unit. Because of the high operating
frequency of 80960KB systems, the probe module connects directly to
the 80960KB socket.


When designing an 80960KB hardware system that 
uses the ICE-960KB to debug the system, several 
electrical and mechanical characteristics should be 
considered. These considerations include capacitive 
loading, drive requirement, power requirement, and 
physical layout. 


The ICE-960KB probe module increases the load capacitance of each
line by up to 25 pF. It also adds one standard Schottky TTL load on
the CLK2 line, up to one advanced low-power Schottky TTL load for
each control signal line, and one advanced low-power Schottky TTL
load for each address/data and byte enable line. These loads
originate from the probe module and are driven by the 80960KB
processor.


To achieve high noise immunity, the ICE-960KB probe is powered by
the user's system. The high-speed probe circuitry draws up to 1.1A
plus the maximum current (ICC) of the 80960KB processor.


The mechanical considerations are shown in Figure 18, which
illustrates the lateral clearance requirements for the ICE-960KB
probe as viewed from above the socket of the 80960KB processor.

Figure 18. ICE-960KB Lateral Clearance Requirements


MECHANICAL DATA


Pin Assignment


The 80960KB pinout as viewed from the substrate side of the
component is shown in Figure 19 and from the pin side in Figure 20.

VCC and GND connections must be made to multiple VCC and GND pins.
Each VCC and GND pin must be connected to the appropriate voltage
or ground and externally strapped close to the package. Preferably,
the circuit board should include power and ground planes for power
distribution. Table 5 and Table 6 list the function of each pin.

NOTE:
Pins identified as N.C., "No Connect," should never be connected
under any circumstances. The 80960KB component contains 54 N.C.
pins.


Package Dimensions and Mounting


The 80960KB is packaged in a 132-lead ceramic pin-grid array (PGA).
Pins in this package are arranged 0.100 inch (2.54 mm)
center-to-center, in a 14 by 14 matrix, three rows around. (See
Figure 21.)

A wide variety of available sockets allow low-insertion or
zero-insertion force mountings, and a choice of terminals such as
soldertail, surface mount, or wire wrap. Several applicable sockets
are shown in Figure 22.


Package Thermal Specification


The 80960KB is specified for operation when case temperature is
within the range 0°C to +85°C. The PGA case temperature should be
measured at the center of the top surface of the package opposite
the pins as shown in Figure 23.

The ambient temperature can be calculated from θjc and θja by using
the following equations:

    TJ = TC + P × θjc
    TA = TJ - P × θja
    TC = TA + P × [θja - θjc]

Values for θja and θjc are given in Table 7 at various airflows.
Note that θja can be reduced by attaching a heatsink to the
package. The maximum allowable ambient temperature (TA) permitted
without exceeding TC is shown by the curve in Figure 25. The curve
assumes an ICC of 545 mA, VCC of 5.0V, and a TCASE of +85°C.
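
The third equation above can be rearranged to give the highest ambient temperature that keeps the case at or below its limit: TA = TC - P × (θja - θjc). The sketch below applies it using the power implied by the quoted operating point (545 mA at 5.0V). The thermal resistance values are HYPOTHETICAL placeholders chosen by this example; real values must be read from Table 7 for the airflow in use.

```python
def ambient_limit_c(t_case_c: float, power_w: float,
                    theta_ja: float, theta_jc: float) -> float:
    """Maximum ambient temperature, from TC = TA + P * (theta_ja - theta_jc)."""
    return t_case_c - power_w * (theta_ja - theta_jc)


# Power at the operating point quoted above: ICC = 545 mA, VCC = 5.0 V.
POWER_W = 0.545 * 5.0

# HYPOTHETICAL thermal resistances in degrees C per watt, NOT from Table 7.
THETA_JA_EXAMPLE = 20.0
THETA_JC_EXAMPLE = 3.0

if __name__ == "__main__":
    limit = ambient_limit_c(85.0, POWER_W, THETA_JA_EXAMPLE, THETA_JC_EXAMPLE)
    print(f"P = {POWER_W:.3f} W, ambient limit = {limit:.1f} C")
```

Because a heatsink or added airflow lowers θja, either one raises the ambient limit returned by the function.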


Figures 24 through 31 show the waveforms for various transactions
on the 80960KB's local bus.

[Figure: 80960KB PGA pinout grid, rows A through P by columns 14
through 1 (270565-11). Pin-to-signal assignments are listed in the
tables that follow.]


Pin   Signal        Pin   Signal        Pin   Signal    Pin   Signal
A1    VCC           C6    LAD20         H1    W/R       M10   VSS
A2    VSS           C7    LAD13         H2    BE0       M11   VCC
A3    LAD19         C8    LAD8          H3    LOCK      M12   N.C.
A4    LAD17         C9    LAD3          H12   N.C.      M13   N.C.
A5    LAD16         C10   VCC           H13   N.C.      M14   N.C.
A6    LAD14         C11   VSS           H14   N.C.      N1    VSS
A7    LAD11         C12   INT3/INTA     J1    DT/R      N2    N.C.
A8    LAD9          C13   INT1          J2    BE2       N3    N.C.
A9    LAD7          C14   IAC/INT0      J3    VSS       N4    N.C.
A10   LAD5          D1    ALE           J12   N.C.      N5    N.C.
A11   LAD4          D2    ADS           J13   N.C.      N6    N.C.
A12   LAD1          D3    HLDA/HOLDR    J14   N.C.      N7    N.C.
A13   INT2/INTR     D12   VCC           K1    BE3       N8    N.C.
A14   VCC           D13   N.C.          K2    FAILURE   N9    N.C.
B1    LAD23         D14   N.C.          K3    VSS       N10   N.C.
B2    LAD24         E1    LAD28         K12   VCC       N11   N.C.
B3    LAD22         E2    LAD26         K13   N.C.      N12   N.C.
B4    LAD21         E3    LAD27         K14   N.C.      N13   N.C.
B5    LAD18         E12   N.C.          L1    DEN       N14   N.C.
B6    LAD15         E13   VSS           L2    N.C.      P1    VCC
B7    LAD12         E14   N.C.          L3    VCC       P2    N.C.
B8    LAD10         F1    LAD29         L12   VSS       P3    N.C.
B9    LAD6          F2    LAD31         L13   N.C.      P4    N.C.
B10   LAD2          F3    CACHE         L14   N.C.      P5    N.C.
B11   CLK2          F12   N.C.          M1    N.C.      P6    N.C.
B12   LAD0          F13   N.C.          M2    N.C.      P7    N.C.
B13   RESET         F14   N.C.          M3    VSS       P8    N.C.
B14   VSS           G1    LAD30         M4    VSS       P9    N.C.
C1    HOLD/HLDAR    G2    READY         M5    VCC       P10   N.C.
C2    LAD25         G3    BE1           M6    N.C.      P11   N.C.
C3    BADAC         G12   N.C.          M7    N.C.      P12   N.C.
C4    VCC           G13   N.C.          M8    N.C.      P13   VSS
C5    VSS           G14   N.C.          M9    N.C.      P14   VCC



Signal        Pin   Signal   Pin   Signal   Pin   Signal   Pin
ADS           D2    LAD15    B6    N.C.     J14   N.C.     P8
ALE           D1    LAD16    A5    N.C.     K13   N.C.     P9
BADAC         C3    LAD17    A4    N.C.     K14   N.C.     P10
BE0           H2    LAD18    B5    N.C.     L13   N.C.     P11
BE1           G3    LAD19    A3    N.C.     L14   N.C.     P12
BE2           J2    LAD20    C6    N.C.     M1    N.C.     L2
BE3           K1    LAD21    B4    N.C.     M2    READY    G2
CACHE         F3    LAD22    B3    N.C.     M6    RESET    B13
CLK2          B11   LAD23    B1    N.C.     M7    VCC      A1
DEN           L1    LAD24    B2    N.C.     M8    VCC      A14
DT/R          J1    LAD25    C2    N.C.     M9    VCC      C4
FAILURE       K2    LAD26    E2    N.C.     M12   VCC      C10
HLDA/HOLDR    D3    LAD27    E3    N.C.     M13   VCC      D12
HOLD/HLDAR    C1    LAD28    E1    N.C.     M14   VCC      K12
IAC/INT0      C14   LAD29    F1    N.C.     N2    VCC      L3
INT1          C13   LAD30    G1    N.C.     N3    VCC      M5
INT2/INTR     A13   LAD31    F2    N.C.     N4    VCC      M11
INT3/INTA     C12   LOCK     H3    N.C.     N5    VCC      P1
LAD0          B12   N.C.     D13   N.C.     N6    VCC      P14
LAD1          A12   N.C.     D14   N.C.     N7    VSS      A2
LAD2          B10   N.C.     E12   N.C.     N8    VSS      B14
LAD3          C9    N.C.     E14   N.C.     N9    VSS      C5
LAD4          A11   N.C.     F12   N.C.     N10   VSS      C11
LAD5          A10   N.C.     F13   N.C.     N11   VSS      E13
LAD6          B9    N.C.     F14   N.C.     N12   VSS      J3
LAD7          A9    N.C.     G12   N.C.     N13   VSS      K3
LAD8          C8    N.C.     G13   N.C.     N14   VSS      L12
LAD9          A8    N.C.     G14   N.C.     P2    VSS      M3
LAD10         B8    N.C.     H12   N.C.     P3    VSS      M4
LAD11         A7    N.C.     H13   N.C.     P4    VSS      M10
LAD12         B7    N.C.     H14   N.C.     P5    VSS      N1
LAD13         C7    N.C.     J12   N.C.     P6    VSS      P13
LAD14         A6    N.C.     J13   N.C.     P7    W/R      H1

[Figure: 132-lead PGA package outline, pin positions A through P by
1 through 14, with dimensions in inches (mm).]

• Low insertion force (LIF) soldertail: 55274-1
• Amp tests indicate 50% reduction in insertion force compared to machined sockets

Other socket options:
• Zero insertion force (ZIF) soldertail: 55583-1
• Zero insertion force (ZIF) burn-in version: 55573-2

Amp Incorporated (Harrisburg, PA 17105 U.S.A.; Phone 717-564-0100)

Cam handle locks in low profile position when 80960KB is installed (handle UP for open and DOWN for closed positions).

Courtesy Amp Incorporated

Peel-A-Way Mylar and Kapton Socket Terminal Carriers

• Low insertion force surface mount: CS132-37TG
• Low insertion force soldertail: CS132-01TG
• Low insertion force wire-wrap: CS132-02TG (two-level), CS132-03TG (three-level)
• Low insertion force press-fit: CS132-05TG

Peel-A-Way Carrier No. 132: Kapton Carrier is KS132; Mylar Carrier is MS132.

Molded Plastic Body KS132 is shown below:

Advanced Interconnections (5 Division Street, Warwick, RI 02818 U.S.A.; Phone 401-885-0485)

[Figure 270565-15: Peel-A-Way terminal carrier]

Courtesy Advanced Interconnections (Peel-A-Way Terminal Carriers, U.S. Patent No. 4442938)


[Timing diagram. Signals shown: CLK2, CLK, LAD31-LAD0, BE3-BE0, W/R, DT/R, DEN, READY]


Thermal Resistance (°C/Watt)

                                      Airflow, ft./min (m/sec)
Parameter                             0     50      100     200     400     600     800
                                      (0)   (0.25)  (0.50)  (1.01)  (2.03)  (3.04)  (4.06)

θ Junction-to-Case                    2     2       2       2       2       2       2
(Case Measured as Figure 6-4)

θ Case-to-Ambient                     19    18      17      15      12      10      9
(No Heatsink)

θ Case-to-Ambient                     16    15      14      12      9       7       6
(with Omnidirectional Heatsink)

θ Case-to-Ambient                     15    14      13      11      8       6       5
(with Unidirectional Heatsink)

NOTES:
1. Table 7 applies to 80960KB PGA plugged into socket or soldered directly into board.
2. θJA = θJC + θCA.
3. θJ-CAP = 4°C/W (approx.)
   θJ-PIN = 4°C/W (inner pins) (approx.)
   θJ-PIN = 8°C/W (outer pins) (approx.)
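The relation in Note 2 can be turned into a quick junction-temperature estimate. The following is a minimal sketch, not a vendor tool: the table values are transcribed from the thermal resistance table above, while the function names, the 25°C ambient and the 2 W power figure in the usage note are hypothetical and chosen only for illustration.

```python
# Thermal resistances in °C/W, keyed by airflow in ft./min (values from the table above).
THETA_JC = 2  # junction-to-case, the same at every airflow
THETA_CA = {
    "none":            {0: 19, 50: 18, 100: 17, 200: 15, 400: 12, 600: 10, 800: 9},
    "omnidirectional": {0: 16, 50: 15, 100: 14, 200: 12, 400: 9, 600: 7, 800: 6},
    "unidirectional":  {0: 15, 50: 14, 100: 13, 200: 11, 400: 8, 600: 6, 800: 5},
}


def theta_ja(heatsink, airflow):
    """Junction-to-ambient resistance: theta_JA = theta_JC + theta_CA (Note 2)."""
    return THETA_JC + THETA_CA[heatsink][airflow]


def junction_temp(ambient_c, power_w, heatsink="none", airflow=0):
    """Estimated junction temperature in °C for a given ambient and dissipated power."""
    return ambient_c + power_w * theta_ja(heatsink, airflow)
```

For example, `junction_temp(25, 2.0)` evaluates 25 + 2.0 × (2 + 19) for still air and no heatsink; the actual power dissipation must come from the device's D.C. specifications.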
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NOTE: 
INTR can go low no sooner than 10 ns (input hold time) following the beginning of interrupt acknowledgement cycle 1. 
For a second interrupt to be acknowledged, INTR must be low for at least three cycles before it can be reasserted.
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376™ HIGH PERFORMANCE 
32-BIT EMBEDDED PROCESSOR 


• Full 32-Bit Internal Architecture
  - 8-, 16-, 32-Bit Data Types
  - 8 General Purpose 32-Bit Registers
  - Extensive 32-Bit Instruction Set

• High Performance 16-Bit Data Bus
  - 16 MHz CPU Clock
  - Two-Clock Bus Cycles
  - 16 Mbytes/Sec Bus Bandwidth

• 16 Mbyte Physical Memory Size

• High Speed Numerics Support with the 80387SX

• Low System Cost with the 82370 Integrated System Peripheral

• On-Chip Debugging Support Including Break Point Registers

• Complete Intel Development Support
  - C, PL/M, Assembler Translators
  - ICE™-376, In-Circuit Emulator
  - iRMK Real Time Kernel

• Extensive Third-Party Support:
  - Software: C, Pascal, FORTRAN, BASIC and ADA*
  - Hosts: VMS*, UNIX*, MS-DOS*, and Others
  - Real-Time Kernels

• High Speed CHMOS Technology

• Available in 100 Pin Plastic Quad Flat-Pack Package and 88-Pin Pin Grid Array
  (See Packaging Outlines and Dimensions #231369)


The 376 32-bit embedded processor is designed for high performance embedded systems. It provides the performance benefits of a highly pipelined 32-bit internal architecture with the low system cost associated with 16-bit hardware systems. The 80376 is based on the 80386 and offers a high degree of compatibility with the 80386. All 80386 32-bit programs not dependent on paging can be executed on the 80376 and all 80376 programs can be executed on the 80386. All 32-bit 80386 language translators can be used for software development. With proper support software, any 80386-based computer can be used to develop and test 80376 programs. In addition, any 80386-based PC-AT* compatible computer can be used for hardware prototyping for designs based on the 80376 and its companion product the 82370.


[Block diagram of the 80376. Labeled blocks: execution unit (32-bit registers, 64-bit barrel shifter, multiply/divide, ALU); protection, segment registers and segment translator; control; bus interface unit with 32-bit data path; instruction decoder and instruction queue; prefetch unit and prefetch queue.]


*UNIX is a registered trademark of AT&T.
ADA is a registered trademark of the U.S. Government, Ada Joint Program Office.
PC-AT is a registered trademark of IBM Corporation.
VMS is a trademark of Digital Equipment Corporation.
MS-DOS is a trademark of Microsoft Corporation.


A Row            B Row            C Row            D Row
Pin  Label       Pin  Label       Pin  Label       Pin  Label

1    D0          26   LOCK#       51   A2          76   A21
2    Vss         27   N/C         52   A3          77   Vss
3    HLDA        28   N/C         53   A4          78   Vss
4    HOLD        29   N/C         54   A5          79   A22
5    Vss         30   N/C         55   A6          80   A23
6    NA#         31   N/C         56   A7          81   D15
7    READY#      32   Vcc         57   Vcc         82   D14
8    Vcc         33   RESET       58   A8          83   D13
9    Vcc         34   BUSY#       59   A9          84   Vcc
10   Vcc         35   Vss         60   A10         85   Vss
11   Vss         36   ERROR#      61   A11         86   D12
12   Vss         37   PEREQ       62   A12         87   D11
13   Vss         38   NMI         63   Vss         88   D10
14   Vss         39   Vcc         64   A13         89   D9
15   CLK2        40   INTR        65   A14         90   D8
16   ADS#        41   Vss         66   A15         91   Vcc
17   BLE#        42   Vcc         67   Vss         92   D7
18   A1          43   N/C         68   Vss         93   D6
19   BHE#        44   N/C         69   Vcc         94   D5
20   N/C         45   N/C         70   A16         95   D4
21   Vcc         46   N/C         71   Vcc         96   D3
22   Vss         47   N/C         72   A17         97   Vcc
23   M/IO#       48   Vcc         73   A18         98   Vss
24   D/C#        49   Vss         74   A19         99   D2
25   W/R#        50   Vss         75   A20         100  D1


[Pin-out drawing: top view (component side) and bottom view (pin side)]

Figure 1.2. 80376 88-Pin Grid Array Pin Out


Table 1.2. 88-Pin Grid Array Pin Assignments

Pin  Label       Pin  Label       Pin  Label       Pin  Label

2H   CLK2        12D  A18         2L   M/IO#       11A  Vcc
9B   D15         12E  A17         5M   LOCK#       13A  Vcc
8A   D14         13E  A16         1J   ADS#        13C  Vcc
8B   D13         12F  A15         1H   READY#      13L  Vcc
7A   D12         13F  A14         2G   NA#         1N   Vcc
7B   D11         12G  A13         1G   HOLD        13N  Vcc
6A   D10         13G  A12         2F   HLDA        11B  Vss
6B   D9          13H  A11         7N   PEREQ       2C   Vss
5A   D8          12H  A10         7M   BUSY#       1D   Vss
5B   D7          13J  A9          8N   ERROR#      1M   Vss
4B   D6          12J  A8          9M   INTR        4N   Vss
4A   D5          12K  A7          8M   NMI         9N   Vss
3B   D4          13K  A6          6M   RESET       11N  Vss
2D   D3          12L  A5          2B   Vcc         2A   Vss
1E   D2          12M  A4          12B  Vcc         12A  Vss
2E   D1          11M  A3          1C   Vcc         1B   Vss
1F   D0          10M  A2          2M   Vcc         13B  Vss
9A   A23         1K   A1          3N   Vcc         13M  Vss
10A  A22         2J   BLE#        5N   Vcc         2N   Vss
10B  A21         2K   BHE#        10N  Vcc         6N   Vss
12C  A20         4M   W/R#        1A   Vcc         12N  Vss
13D  A19         3M   D/C#        3A   Vcc         1L   N/C


The following table lists a brief description of each pin on the 80376. The following definitions are used in these descriptions:

#     The named signal is active LOW.
I     Input signal.
O     Output signal.
I/O   Input and Output signal.
-     No electrical connection.


Symbol       Type   Name and Function

CLK2         I      CLK2 provides the fundamental timing for the 80376. For additional information see Clock (page 33).

RESET        I      RESET suspends any operation in progress and places the 80376 in a known reset state. See Interrupt Signals (page 38) for additional information.

D15-D0       I/O    DATA BUS inputs data during memory, I/O and interrupt acknowledge read cycles and outputs data during memory and I/O write cycles. See Data Bus (page 34) for additional information.

A23-A1       O      ADDRESS BUS outputs physical memory or port I/O addresses. See Address Bus (page 34) for additional information.

W/R#         O      WRITE/READ is a bus cycle definition pin that distinguishes write cycles from read cycles. See Bus Cycle Definition Signals (page 35) for additional information.

D/C#         O      DATA/CONTROL is a bus cycle definition pin that distinguishes data cycles, either memory or I/O, from control cycles which are: interrupt acknowledge, halt, and instruction fetching. See Bus Cycle Definition Signals (page 35) for additional information.

M/IO#        O      MEMORY I/O is a bus cycle definition pin that distinguishes memory cycles from input/output cycles. See Bus Cycle Definition Signals (page 35) for additional information.

LOCK#        O      BUS LOCK is a bus cycle definition pin that indicates that other system bus masters are denied access to the system bus while it is active. See Bus Cycle Definition Signals (page 35) for additional information.

ADS#         O      ADDRESS STATUS indicates that a valid bus cycle definition and address (W/R#, D/C#, M/IO#, BHE#, BLE# and A23-A1) are being driven at the 80376 pins. See Bus Control Signals (page 35) for additional information.

NA#          I      NEXT ADDRESS is used to request address pipelining. See Bus Control Signals (page 35) for additional information.

READY#       I      BUS READY terminates the bus cycle. See Bus Control Signals (page 35) for additional information.

BHE#, BLE#   O      BYTE ENABLES indicate which data bytes of the data bus take part in a bus cycle. See Address Bus (page 34) for additional information.

HOLD         I      BUS HOLD REQUEST input allows another bus master to request control of the local bus. See Bus Arbitration Signals (page 36) for additional information.


HLDA         O      BUS HOLD ACKNOWLEDGE output indicates that the 80376 has surrendered control of its local bus to another bus master. See Bus Arbitration Signals (page 36) for additional information.

INTR         I      INTERRUPT REQUEST is a maskable input that signals the 80376 to suspend execution of the current program and execute an interrupt acknowledge function. See Interrupt Signals (page 38) for additional information.

NMI          I      NON-MASKABLE INTERRUPT REQUEST is a non-maskable input that signals the 80376 to suspend execution of the current program and execute an interrupt acknowledge function. See Interrupt Signals (page 38) for additional information.

BUSY#        I      BUSY signals a busy condition from a processor extension. See Coprocessor Interface Signals (page 37) for additional information.

ERROR#       I      ERROR signals an error condition from a processor extension. See Coprocessor Interface Signals (page 37) for additional information.

PEREQ        I      PROCESSOR EXTENSION REQUEST indicates that the processor extension has data to be transferred by the 80376. See Coprocessor Interface Signals (page 37) for additional information.

N/C          -      NO CONNECT should always remain unconnected. Connection of an N/C pin may cause the processor to malfunction or be incompatible with future steppings of the 80376.

Vcc          I      SYSTEM POWER provides the +5V nominal D.C. supply input.

Vss          I      SYSTEM GROUND provides 0V connection from which all inputs and outputs are measured.


The 80376 supports the protection mechanisms needed by sophisticated multitasking embedded systems and real-time operating systems. The use of these protection mechanisms is completely optional. For embedded applications not needing protection, the 80376 can easily be configured to provide a 16 Mbyte physical address space.

Instruction pipelining, high bus bandwidth, and a very high performance ALU ensure short average instruction execution times and high system throughput. The 80376 is capable of execution at sustained rates of 2.5-3.0 million instructions per second.

The 80376 offers on-chip testability and debugging features. Four break point registers allow conditional or unconditional break point traps on code execution or data accesses for powerful debugging of even ROM based systems. Other testability features include self-test and tri-stating of output buffers during RESET.

The Intel 80376 embedded processor consists of a central processing unit, a memory management unit and a bus interface. The central processing unit consists of the execution unit and instruction unit. The execution unit contains the eight 32-bit general registers which are used for both address calculation and data operations and a 64-bit barrel shifter used to speed shift, rotate, multiply, and divide operations. The instruction unit decodes the instruction opcodes and stores them in the decoded instruction queue for immediate use by the execution unit.

The Memory Management Unit (MMU) consists of a segmentation and protection unit. Segmentation allows the managing of the logical address space by providing an extra addressing component, one that allows easy code and data relocatability, and efficient sharing.

The protection unit provides four levels of protection for isolating and protecting applications and the operating system from each other. The hardware enforced protection allows the design of systems with a high degree of integrity and simplifies debugging.

Finally, to facilitate high performance system hardware designs, the 80376 bus interface offers address pipelining and direct Byte Enable signals for each byte of the data bus.


The 80376 has twenty-nine registers as shown in Figure 2.1. These registers are grouped into the following six categories:


[Figure 2.1. 80376 registers: general purpose registers (EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP, with 16-bit names AX, BX, CX, DX, SI, DI, BP, SP and 8-bit names AH/AL, BH/BL, CH/CL, DH/DL); segment registers (CS, SS, DS, ES, FS, GS); flags and instruction pointer (EFLAGS, EIP); control register (CR0); system address registers (GDTR, IDTR, LDTR, TR); debug registers (linear breakpoint addresses 0-3, breakpoint status, breakpoint control).]
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General Registers: The eight 32-bit general pur- 
pose registers are used to contain arithmetic and 
logical operands. Four of these (EAX, EBX, ECX and
EDX) can be used either in their entirety as 32-bit
registers, as 16-bit registers, or split into pairs of
separate 8-bit registers.
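The overlapping 32-, 16- and 8-bit views of a general register can be sketched with masks and shifts. This is an illustrative model only; the helper names are hypothetical, and the layout (AX as the low 16 bits of EAX, AH and AL as the high and low bytes of AX) follows the register names shown in Figure 2.1.

```python
def sub_registers(eax):
    """Return the 16-bit and 8-bit views of a 32-bit general register value."""
    eax &= 0xFFFFFFFF
    ax = eax & 0xFFFF        # low 16 bits of EAX
    ah = (ax >> 8) & 0xFF    # high byte of AX
    al = ax & 0xFF           # low byte of AX
    return {"EAX": eax, "AX": ax, "AH": ah, "AL": al}


def write_al(eax, value):
    """Model a write to AL: only the low byte changes, the upper 24 bits are kept."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)
```

The same masks apply to EBX, ECX and EDX; the remaining four general registers have 32- and 16-bit names only.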


Segment Registers: Six 16-bit special purpose registers
select, at any given time, the segments of
memory that are immediately addressable for code,
stack, and data.


Flags and Instruction Pointer Registers: These 
two 32-bit special purpose registers in Figure 2.1 
record or control certain aspects of the 80376 proc- 
essor state. The EFLAGS register includes status 
and control bits that are used to reflect the outcome 
of many instructions and modify the semantics of 
some instructions. The Instruction Pointer, called 
EIP, is 32 bits wide. The Instruction Pointer controls 
instruction fetching and the processor automatically 
increments it after executing an instruction. 


Control Register: The 32-bit control register, CR0,
is used to control Coprocessor Emulation.


System Address Registers: These four special
registers reference the tables or segments supported
by the 80376/80386 protection model. These tables
or segments are:

GDTR (Global Descriptor Table Register),
IDTR (Interrupt Descriptor Table Register),
LDTR (Local Descriptor Table Register),
TR (Task State Segment Register).


Debug Registers: The six programmer accessible 
debug registers provide on-chip support for debug- 
ging. The use of the debug registers is described in 
Section 2.11 Debugging Support. 


The Flag Register is a 32-bit register named
EFLAGS. The defined bits and bit fields within
EFLAGS, shown in Figure 2.2, control certain operations
and indicate the status of the 80376 processor.
The function of the flag bits is given in Table 2.1.


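[Figure 2.2. Status and control register bit fields. Labels shown: overflow, sign, zero, auxiliary carry, parity, carry, trap, interrupt, direction, resume, I/O privilege level, nested task; monitor coprocessor, emulate coprocessor, task switched.]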
Bit Position  Name  Function

0             CF    Carry Flag: Set on high-order bit carry or borrow; cleared otherwise.

2             PF    Parity Flag: Set if low-order 8 bits of result contain an even number of 1-bits; cleared otherwise.

4             AF    Auxiliary Carry Flag: Set on carry from or borrow to the low order four bits of AL; cleared otherwise.

6             ZF    Zero Flag: Set if result is zero; cleared otherwise.

7             SF    Sign Flag: Set equal to high-order bit of result (0 if positive, 1 if negative).

8             TF    Single Step Flag: Once set, a single step interrupt occurs after the next instruction executes. TF is cleared by the single step interrupt.

9             IF    Interrupt-Enable Flag: When set, external interrupts signaled on the INTR pin will cause the CPU to transfer control to an interrupt vector specified location.

10            DF    Direction Flag: Causes string instructions to auto-increment (default) the appropriate index registers when cleared. Setting DF causes auto-decrement.

11            OF    Overflow Flag: Set if the operation resulted in a carry/borrow into the sign bit (high-order bit) of the result but did not result in a carry/borrow out of the high-order bit or vice-versa.

12, 13        IOPL  I/O Privilege Level: Indicates the maximum CPL permitted to execute I/O instructions without generating an exception 13 fault or consulting the I/O permission bit map. It also indicates the maximum CPL value allowing alteration of the IF bit.

14            NT    Nested Task: Indicates that the execution of the current task is nested within another task (see Task Switching).

16            RF    Resume Flag: Used in conjunction with debug register breakpoints. It is checked at instruction boundaries before breakpoint processing. If set, any debug fault is ignored on the next instruction. It is reset at the successful completion of any instruction except IRET, POPF, and those instructions causing task switches.
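The bit positions in the flag table map directly onto shift-and-mask operations. The sketch below decodes a raw EFLAGS value and computes the parity flag for a result; the function names are hypothetical and the code is a software illustration, not a description of how the processor evaluates flags internally.

```python
# Single-bit flags and their bit positions, taken from the flag table above.
FLAG_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "TF": 8,
    "IF": 9, "DF": 10, "OF": 11, "NT": 14, "RF": 16,
}


def decode_eflags(value):
    """Split a raw EFLAGS value into named fields (IOPL is the 2-bit field at bits 12-13)."""
    flags = {name: (value >> bit) & 1 for name, bit in FLAG_BITS.items()}
    flags["IOPL"] = (value >> 12) & 0x3
    return flags


def parity_flag(result):
    """PF is set when the low-order 8 bits of the result hold an even number of 1-bits."""
    return 1 if bin(result & 0xFF).count("1") % 2 == 0 else 0
```

Only the low-order byte takes part in the parity calculation, so `parity_flag(0x100)` sees eight zero bits.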


The 80376 has a 32-bit control register called CR0 that is used to control coprocessor emulation. This register
is shown in Figures 2.1 and 2.2. The defined CR0 bits are described in Table 2.2.


Table 2.2. CR0 Definitions

Bit Position  Name  Function

1             MP    Monitor Coprocessor Extension: Allows WAIT instructions to cause a processor extension not present exception (number 7).

2             EM    Emulate Processor Extension: When set, this bit causes a processor extension not present exception (number 7) on ESC instructions to allow processor extension emulation.

3             TS    Task Switched: When set, this bit indicates the next instruction using a processor extension will cause exception 7, allowing software to test whether the current processor extension context belongs to the current task (see Task Switching).
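The interaction of the three CR0 bits can be summarized as two predicates. This is a sketch under a stated assumption: the table says MP "allows" WAIT to raise exception 7 but does not spell out the full condition, so the code assumes the usual 80386-family rule that WAIT faults only when both MP and TS are set. The function names are hypothetical.

```python
# CR0 bit masks, from the bit positions in Table 2.2.
MP = 1 << 1  # Monitor Coprocessor Extension
EM = 1 << 2  # Emulate Processor Extension
TS = 1 << 3  # Task Switched


def esc_raises_exception_7(cr0):
    """ESC (coprocessor) instructions fault when EM or TS is set."""
    return bool(cr0 & (EM | TS))


def wait_raises_exception_7(cr0):
    """Assumption: WAIT faults only when both MP and TS are set."""
    return bool(cr0 & MP) and bool(cr0 & TS)
```

A kernel that emulates the coprocessor in software would set EM; one that saves coprocessor state lazily would rely on TS.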



The instruction set is divided into nine categories of 
operations: 


Data Transfer 
Arithmetic 
Shift/Rotate 
String Manipulation 
Bit Manipulation 
Control Transfer 
High Level Language Support 
Operating System Support 
Processor Control 


These 80376 processor instructions are listed in Table 8.1,
80376 Instruction Set and Clock Count Summary.


All 80376 processor instructions operate on either 0, 
1, 2 or 3 operands; an operand resides in a register, 
in the instruction itself, or in memory. Most zero op- 
erand instructions (e.g. CLI, STI) take only one byte. 
One operand instructions generally are two bytes 
long. The average instruction is 3.2 bytes long. 
Since the 80376 has a 16-byte instruction prefetch
queue, an average of 5 instructions (16 bytes at 3.2 bytes
per instruction) can be prefetched. The use of two
operands permits the following types of common instructions:


Register to Register 
Memory to Register 
Immediate to Register 
Memory to Memory 
Register to Memory 
Immediate to Memory 


The operands are 8, 16 or 32 bits long.


2.3 Memory Organization 


Memory on the 80376 is divided into 8-bit quantities 
(bytes), 16-bit quantities (words), and 32-bit quanti- 
ties (dwords). Words are stored in two consecutive 
bytes in memory with the low-order byte at the low- 
est address. Dwords are stored in four consecutive 
bytes in memory with the low-order byte at the low- 
est address. The address of a word or Dword is the 
byte address of the low-order byte. 
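The byte ordering described above (low-order byte at the lowest address) is what Python's `struct` module calls little-endian. A minimal sketch, using a hypothetical 8-byte buffer to stand in for memory:

```python
import struct

memory = bytearray(8)  # hypothetical stand-in for a small region of memory

# Store a word (16 bits) at address 0 and a dword (32 bits) at address 2.
struct.pack_into("<H", memory, 0, 0x1234)
struct.pack_into("<I", memory, 2, 0x89ABCDEF)

# The address of each item is the address of its low-order byte.
word_low_byte = memory[0]    # 0x34
word_high_byte = memory[1]   # 0x12
dword_bytes = bytes(memory[2:6])
dword_value = struct.unpack_from("<I", memory, 2)[0]
```

The `<` prefix selects little-endian packing regardless of the host machine, so the sketch behaves the same on any platform.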


In addition to these basic data types the 80376 proc- 
essor supports segments. Memory can be divided 
up into one or more variable length segments, which 
can be shared between programs. 


ADDRESS 
SPACES 


The 80376 has three types of address spaces: logical, linear, and physical. A logical address (also known as a virtual address) consists of a selector and an offset. A selector is the contents of a segment register. An offset is formed by summing all of the addressing components (BASE, INDEX, and DISPLACEMENT), discussed in Section 2.4 Addressing Modes, into an effective address.


Every selector has a logical base address associated
with it that can be up to 32 bits in length. This 32-bit
logical base address is added to either a 32-bit
offset address or a 16-bit offset address (by using
the address length prefix) to form a final 32-bit
linear address. This final linear address is then truncated
so that only the lower 24 bits of this address
are used to address the 16 Mbytes physical memory
address space. The logical base address is stored
in one of two operating system tables (i.e. the Local
Descriptor Table or Global Descriptor Table).
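The two steps described above (base plus offset, then truncation to 24 bits) can be sketched as follows. The function names are hypothetical, and wrapping the sum modulo 2^32 is an assumption made so that the result always fits the 32-bit linear address the text describes.

```python
def linear_address(segment_base, offset, address_size=32):
    """Add a 16- or 32-bit offset to a 32-bit segment base to form a linear address."""
    if address_size == 16:
        offset &= 0xFFFF          # offset limited by the address length prefix
    else:
        offset &= 0xFFFFFFFF
    return (segment_base + offset) & 0xFFFFFFFF  # assumption: wraps at 32 bits


def physical_address(linear):
    """Only the lower 24 bits reach the 16 Mbyte physical address space."""
    return linear & 0x00FFFFFF
```

Because of the truncation, two linear addresses that differ only in their upper 8 bits select the same physical location.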


Figure 2.3 shows the relationship between the vari- 
ous address spaces. 
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Figure 2.3. Address 
Translation 


The main data structure used to organize memory is the segment. On the 80376, segments are variable sized blocks of linear addresses which have certain attributes associated with them. There are two main types of segments, code and data. The simplest use of segments is to have one code and data segment. Each segment is 16 Mbytes in size, and the two overlap. This allows code and data to be directly addressed by the same offset.

In order to provide compact instruction encoding and increase processor performance, instructions do not need to explicitly specify which segment register is used. The segment register is automatically chosen according to the rules of Table 2.3 (Segment Register Selection Rules). In general, data references use the selector contained in the DS register, stack references use the SS register and instruction fetches use the CS register. The contents of the Instruction Pointer provide the offset. Special segment override prefixes allow the explicit use of a given segment register, and override the implicit rules listed in Table 2.3. The override prefixes also allow the use of the ES, FS and GS segment registers.

There are no restrictions regarding the overlapping of the base addresses of any segments. Thus, all 6 segments could have the base address set to zero. Further details of segmentation are discussed in Section 3.0 Architecture.


Type of Memory Reference                      Implied (Default)   Segment Override
                                              Segment Use         Prefixes Possible

Code Fetch                                    CS                  None

Destination of PUSH, PUSHF, INT,              SS                  None
CALL, PUSHA Instructions

Source of POP, POPA, POPF, IRET,              SS                  None
RET Instructions

Destination of STOS, MOVS, REP STOS,          ES                  None
REP MOVS Instructions
(DI is Base Register)

Other Data References, with Effective
Address Using Base Register of:
  [EAX]                                       DS                  CS, SS, ES, FS, GS
  [EBX]                                       DS                  CS, SS, ES, FS, GS
  [ECX]                                       DS                  CS, SS, ES, FS, GS
  [EDX]                                       DS                  CS, SS, ES, FS, GS
  [ESI]                                       DS                  CS, SS, ES, FS, GS
  [EDI]                                       DS                  CS, SS, ES, FS, GS
  [EBP]                                       SS                  CS, SS, ES, FS, GS
  [ESP]                                       SS                  CS, SS, ES, FS, GS
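The data-reference rows of the segment selection table amount to a lookup keyed by base register, with an optional override. The following sketch models only that part of the table; the function name is hypothetical, and accepting any of the six segment register names as an override is a simplification.

```python
# Default segment by base register, from the data-reference rows of the table above.
DEFAULT_SEGMENT = {
    "EAX": "DS", "EBX": "DS", "ECX": "DS", "EDX": "DS",
    "ESI": "DS", "EDI": "DS", "EBP": "SS", "ESP": "SS",
}
SEGMENT_REGISTERS = {"CS", "SS", "DS", "ES", "FS", "GS"}


def segment_for(base_register, override=None):
    """Return the segment register used for a data reference through base_register."""
    if override is not None:
        if override not in SEGMENT_REGISTERS:
            raise ValueError("unknown segment register: " + override)
        return override  # an override prefix replaces the implied segment
    return DEFAULT_SEGMENT[base_register]
```

Code fetches, stack pushes and pops, and string destinations are not modeled because the table lists no override for them.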


2.4 Addressing Modes

The 80376 provides a total of 8 addressing modes for instructions to specify operands. The addressing modes are optimized to allow the efficient execution of high level languages such as C and FORTRAN, and they cover the vast majority of data references needed by high-level languages.

Two of the addressing modes provide for instructions that operate on register or immediate operands:

Register Operand Mode: The operand is located in one of the 8-, 16- or 32-bit general registers.

Immediate Operand Mode: The operand is included in the instruction as part of the opcode.

The remaining 6 modes provide a mechanism for specifying the effective address of an operand. The linear address consists of two components: the segment base address and an effective address. The effective address is calculated by summing any combination of the following three address elements (see Figure 2.3):

DISPLACEMENT: an 8-, 16- or 32-bit immediate value following the instruction.

BASE: The contents of any general purpose register. The base registers are generally used by compilers to point to the start of the local variable area. Note that if the Address Length Prefix is used, only BX and BP can be used as a BASE register.

INDEX: The contents of any general purpose register except for ESP. The index registers are used to access the elements of an array, or a string of characters. The index register's value can be multiplied by a scale factor, either 1, 2, 4 or 8. The scaled index is especially useful for accessing arrays or structures. Note that if the Address Length Prefix is used, no Scaling is available and only the registers SI and DI can be used to INDEX.


There is no performance penalty for using any of these addressing combinations, since the effective address calculation is pipelined with the execution of other instructions. The one exception is the simultaneous use of BASE and INDEX components which requires one additional clock.


As shown in Figure 2.4, the effective address (EA) of 
an operand is calculated according to the following 
formula: 


EA = BASE Register + (INDEX Register × scaling) + DISPLACEMENT

1. Direct Mode: The operand's offset is contained as part of the instruction as an 8-, 16- or 32-bit DISPLACEMENT.

3. Based Mode: A BASE register's contents are added to a DISPLACEMENT to form the operand's offset.

4. Scaled Index Mode: An INDEX register's contents are multiplied by a SCALING factor, and the result is added to a DISPLACEMENT to form the operand's offset.

5. Based Scaled Index Mode: The contents of an INDEX register are multiplied by a SCALING factor and the result is added to the contents of a BASE register to obtain the operand's offset.

6. Based Scaled Index Mode with Displacement: The contents of an INDEX register are multiplied by a SCALING factor, and the result is added to the contents of a BASE register and a DISPLACEMENT to form the operand's offset.
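The effective address formula can be checked with a few lines of code. This is a minimal sketch with a hypothetical function name; each memory addressing mode listed above corresponds to leaving some of the arguments at their defaults. Wrapping the result to 32 bits is an assumption.

```python
def effective_address(base=0, index=0, scale=1, displacement=0):
    """EA = BASE + (INDEX x scale) + DISPLACEMENT, wrapped to 32 bits (assumption)."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale factor must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


# Direct mode: displacement only.
direct = effective_address(displacement=0x1000)

# Based scaled index mode with displacement: element 3 of an array of
# 4-byte items located 8 bytes past the address held in the base register.
element = effective_address(base=0x2000, index=3, scale=4, displacement=8)
```

A compiler would typically use the last form for an access such as a field inside an array of records addressed relative to a frame pointer.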
The 80376 executes code with a default length for operands and addresses of 32 bits. The 80376 is also able to execute operands and addresses of 16 bits. This is specified through the use of override prefixes. Two prefixes, the Operand Length Prefix and the Address Length Prefix, override the default 32-bit length on an individual instruction basis.

These prefixes are automatically added by assemblers. The Operand Length and Address Length Prefixes can be applied separately or in combination to any instruction.


The 80376 normally executes 32-bit code and uses either 8- or 32-bit displacements, and any register can be used as base or index registers. When executing 16-bit code (by prefix overrides), the displacements are either 8 or 16 bits, and the base and index register conform to the 16-bit model. Table 2.4 illustrates the differences.


                  16-Bit Addressing   32-Bit Addressing

BASE REGISTER     BX, BP              Any 32-Bit GP Register
INDEX REGISTER    SI, DI              Any 32-Bit GP Register except ESP
SCALE FACTOR      None                1, 2, 4, 8
DISPLACEMENT      0, 8, 16 Bits       0, 8, 32 Bits
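The differences between the two addressing models can be expressed as a small rule table and a validity check. This sketch uses hypothetical names; "None" in the scale factor row is modeled as a scale of 1, meaning the index is used unscaled.

```python
GP32 = {"EAX", "EBX", "ECX", "EDX", "ESI", "EDI", "EBP", "ESP"}

# Rules transcribed from the 16-bit versus 32-bit addressing table above.
ADDRESSING_RULES = {
    16: {"base": {"BX", "BP"}, "index": {"SI", "DI"},
         "scales": {1}, "disp_bits": {0, 8, 16}},
    32: {"base": GP32, "index": GP32 - {"ESP"},
         "scales": {1, 2, 4, 8}, "disp_bits": {0, 8, 32}},
}


def is_valid(mode, base=None, index=None, scale=1, disp_bits=0):
    """Check an operand's address components against the rules for the given mode."""
    rules = ADDRESSING_RULES[mode]
    if base is not None and base not in rules["base"]:
        return False
    if index is not None and index not in rules["index"]:
        return False
    return scale in rules["scales"] and disp_bits in rules["disp_bits"]
```

An assembler front end could run a check like this before deciding whether an Address Length Prefix is needed.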


The 80376 supports all of the data types commonly used in high level
languages:

Bit:                   A single bit quantity.

Bit Field:             A group of up to 32 contiguous bits, which spans a
                       maximum of four bytes.

Bit String:            A set of contiguous bits; on the 80376 bit strings
                       can be up to 16 Mbits long.

Byte:                  A signed 8-bit quantity.

Unsigned Byte:         An unsigned 8-bit quantity.

Integer (Word):        A signed 16-bit quantity.

Long Integer
(Double Word):         A signed 32-bit quantity. All operations assume a
                       2's complement representation.

Unsigned Integer
(Word):                An unsigned 16-bit quantity.

Unsigned Long Integer
(Double Word):         An unsigned 32-bit quantity.

Signed Quad Word:      A signed 64-bit quantity.

Unsigned Quad Word:    An unsigned 64-bit quantity.

Offset:                A 16- or 32-bit offset only quantity which
                       indirectly references another memory location.

Pointer:               A full pointer which consists of a 16-bit segment
                       selector and either a 16- or 32-bit offset.

Char:                  A byte representation of an ASCII alphanumeric or
                       control character.

String:                A contiguous sequence of bytes, words or dwords. A
                       string may contain between 1 byte and 16 Mbytes.

BCD:                   A byte (unpacked) representation of decimal digits
                       0-9.

Packed BCD:            A byte (packed) representation of two decimal
                       digits 0-9, storing one digit in each nibble.

When the 80376 is coupled with a numerics coprocessor such as the
80387SX, the following common floating point types are supported.

Floating Point:        A signed 32-, 64- or 80-bit real number
                       representation. Floating point numbers are
                       supported by the 80387SX numerics coprocessor.
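As an illustration of the packed BCD type described above (two decimal digits, one per nibble), here is a minimal Python sketch. The helper names are hypothetical and are not part of any Intel tool.

```python
def pack_bcd(high_digit, low_digit):
    """Pack two decimal digits (0-9) into one byte, one digit per nibble."""
    if not (0 <= high_digit <= 9 and 0 <= low_digit <= 9):
        raise ValueError("BCD digits must be in the range 0-9")
    return (high_digit << 4) | low_digit


def unpack_bcd(byte):
    """Split a packed BCD byte back into its (high, low) decimal digits."""
    high_digit, low_digit = (byte >> 4) & 0x0F, byte & 0x0F
    if high_digit > 9 or low_digit > 9:
        raise ValueError("byte does not hold two valid BCD digits")
    return high_digit, low_digit
```

Packing the digits 4 and 2 yields the byte 42H, which reads as the decimal number 42 when displayed in hexadecimal.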
The 80376 has two distinct physical address spaces: physical memory
and I/O. Generally, peripherals are placed in I/O space although the
80376 also supports memory-mapped peripherals.
The I/O space consists of 64 Kbytes which can be 
divided into 64K 8-bit ports, 32K 16-bit ports, or any 
combination of ports which add to no more than 64 
Kbytes. The M/IO# 
pin acts as an additional ad- 
dress line, thus allowing the system designer to easi- 
ly determine which address space the processor is 
accessing. Note that the I/O address refers to a 
physical address. 


The I/O ports are accessed by the IN and OUT in- 
structions, with the port address supplied as an im- 
mediate 8-bit constant in the instruction or in the DX 
register. All 8-bit and 16-bit port addresses are zero 
extended on the upper address lines. The I/O in- 
structions cause the M/IO# 
pin to be driven LOW. 
I/O port addresses 00F8H through 00FFH are reserved
for use by Intel.


Interrupts and exceptions alter the normal program 
flow in order to handle external events, report errors 
or exceptional conditions. The difference between interrupts
and exceptions is that interrupts are used to
handle asynchronous external events while excep- 
tions handle instruction faults. Although a program 
can generate a software interrupt via an INT N in- 
struction, the processor treats software interrupts as 
exceptions. 


Hardware interrupts occur as the result of an exter- 
nal event and are classified into two types: maskable 
or non-maskable. Interrupts are serviced after the 
execution of the current instruction. After the inter- 
rupt handler is finished servicing the interrupt, exe- 
cution proceeds with the instruction immediately 
after the interrupted instruction. 


Exceptions are classified as faults, traps, or aborts depending on
the way they are reported, and whether or not restart of the
instruction causing the exception is supported. Faults are exceptions
that are detected and serviced before the execution of the faulting
instruction. Traps are exceptions that are reported immediately after
the execution of the instruction which caused the problem. Aborts are
exceptions which do not permit the precise location of the instruction
causing the exception to be determined. Thus, when an interrupt
service routine has been completed, execution proceeds from the
instruction immediately following the interrupted instruction. On the
other hand the return address from an exception/fault routine will
always point at the instruction causing the exception and include any
leading instruction prefixes. Table 2.5 summarizes the possible
interrupts for the 80376 and shows where the return address points.


The 80376 has the ability to handle up to 256 differ- 
ent interrupts/exceptions. In order to service the in- 
terrupts, a table with up to 256 interrupt vectors 
must be defined. The interrupt vectors are simply 
pointers to the appropriate interrupt service routine. 
The interrupt vectors are 8-byte quantities, which are 
put in an Interrupt Descriptor Table. Of the 256 pos- 
sible interrupts, 32 are reserved for use by Intel and 
the remaining 224 are free to be used by the system 
designer. 
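The table arithmetic in this paragraph can be checked with a short Python sketch. The names are hypothetical, and the assumption that the 32 Intel-reserved vectors are numbers 0 through 31 is stated here for illustration only.

```python
IDT_ENTRY_SIZE = 8       # each interrupt vector is an 8-byte quantity
MAX_VECTORS = 256        # up to 256 interrupts/exceptions
RESERVED_VECTORS = 32    # assumed to be vectors 0-31


def idt_entry_offset(vector):
    """Byte offset of a vector's entry from the start of the table."""
    if not 0 <= vector < MAX_VECTORS:
        raise ValueError("vector must be in the range 0-255")
    return vector * IDT_ENTRY_SIZE


def idt_size_bytes(entries=MAX_VECTORS):
    """Size in bytes of a table holding the given number of entries."""
    return entries * IDT_ENTRY_SIZE


def is_intel_reserved(vector):
    """True for the vectors assumed to be reserved for use by Intel."""
    return 0 <= vector < RESERVED_VECTORS
```

A full 256-entry table therefore occupies 2048 bytes, and a table that only covers the reserved vectors occupies 256 bytes.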


When an interrupt occurs the following actions hap- 
pen. First, the current program address and the 
Flags are saved on the stack to allow resumption of 
the interrupted program. Next, an 8-bit vector is sup- 
plied to the 80376 which identifies the appropriate 
entry in the interrupt table. The table contains either 
an Interrupt Gate, a Trap Gate or a Task Gate that 
will point to an interrupt procedure or task. The user 
supplied interrupt service routine is executed. Final- 
ly, when an IRET instruction is executed the old 
processor state is restored and program execution 
resumes at the appropriate instruction. 


The 8-bit interrupt vector is supplied to the 80376 in 
several different ways: exceptions supply the inter- 
rupt vector internally; software INT instructions con- 
tain or imply the vector; maskable hardware inter- 
rupts supply the 8-bit vector via the interrupt ac- 
knowledge bus sequence. Non-Maskable hardware 
interrupts are assigned to interrupt vector 2. 


Maskable Interrupt


Maskable interrupts are the most common way to 
respond to asynchronous external hardware events. 
A hardware interrupt occurs when the INTR is pulled 
HIGH and the Interrupt Flag bit (IF) is enabled. The 
processor only responds to interrupts between in- 
structions (string instructions have an "interrupt win- 
dow" between memory moves which allows inter- 
rupts during long string moves). When an interrupt 
occurs the processor reads an 8-bit vector supplied 
by the hardware which identifies the source of the 
interrupt (one of 224 user defined interrupts). 


                             Interrupt  Instruction Which              Return Address Points
Function                     Number     Can Cause Exception            to Faulting Instruction  Type

Divide Error                 0          DIV, IDIV                      Yes                      FAULT
Debug Exception              1          Any Instruction                Yes                      TRAP*
NMI Interrupt                2          INT 2 or NMI                   No                       NMI
One-Byte Interrupt           3          INT                            No                       TRAP
Interrupt on Overflow        4          INTO                           No                       TRAP
Array Bounds Check           5          BOUND                          Yes                      FAULT
Invalid OP-Code              6          Any Illegal Instruction        Yes                      FAULT
Device Not Available         7          ESC, WAIT                      Yes                      FAULT
Double Fault                 8          Any Instruction That Can                                ABORT
                                        Generate an Exception
Coprocessor Segment Overrun  9          ESC                            No                       ABORT
Invalid TSS                  10         JMP, CALL, IRET, INT           Yes                      FAULT
Segment Not Present          11         Segment Register Instructions  Yes                      FAULT
Stack Fault                  12         Stack References               Yes                      FAULT
General Protection Fault     13         Any Memory Reference           Yes                      FAULT
Intel Reserved               14-15      -                              -                        -
Coprocessor Error            16         ESC, WAIT                      Yes                      FAULT
Intel Reserved               17-32
Two-Byte Interrupt           0-255      INT n                          No                       TRAP


Interrupts through Interrupt Gates automatically re- 
set IF, disabling INTR requests. Interrupts through 
Trap Gates leave the state of the IF bit unchanged. 
Interrupts through a Task Gate change the IF bit according
to the image of the EFLAGS register in the
task's Task State Segment (TSS). When an IRET 
instruction is executed, the original state of the IF bit 
is restored. 


Non-Maskable Interrupt


Non-maskable interrupts provide a method of servic- 
ing very high priority interrupts. When the NMI input 
is pulled HIGH it causes an interrupt with an internal- 
ly supplied vector value of 2. Unlike a normal hard- 
ware interrupt no interrupt acknowledgement se- 
quence is performed for an NMI. 


While executing the NMI servicing procedure, the 
80376 will not service any further NMI request, or 
INT requests, until an interrupt return (IRET) instruc- 


tion is executed or the processor is reset. If NMI 
occurs while currently servicing an NMI, its presence 
will be saved for servicing after executing the first 
IRET instruction. The disabling of INTR requests de- 
pends on the gate in IDT location 2. 


Software Interrupts


A third type of interrupt/exception for the 80376 is 
the software interrupt. An INT n instruction causes 
the processor to execute the interrupt service routine
pointed to by the nth vector in the interrupt table.


A special case of the two byte software interrupt 
INT n is the one byte INT 3, or breakpoint interrupt. 
By inserting this one byte instruction in a program, 
the user can set breakpoints in his program as a 
debugging tool. 




A final type of software interrupt is the single-step
interrupt. It is discussed in Single-Step Trap (page 22).


Interrupts are externally-generated events. Maska- 
ble Interrupts (on the INTR input) and Non-Maskable 
Interrupts (on the NMI input) are recognized at in- 
struction 
boundaries. When 
NMI and maskable 
INTR are both recognized at the same instruction 
boundary, the 80376 invokes the NMI service rou- 
tine first. If, after the NMI service routine has been 
invoked, maskable interrupts are still enabled, then 
the 80376 will invoke the appropriate interrupt serv- 
ice routine. 


As the 80376 executes instructions, it follows a con- 
sistent cycle in checking for exceptions, as shown in 
Table 2.6. This cycle is repeated as each instruction 
is executed, and occurs in parallel with instruction 
decoding and execution. 


The 80376 fully supports restarting all instructions 
after faults. If an exception is detected in the instruc- 
tion to be executed (exception categories 4 through 
9 in Table 2.6), the 80376 device invokes the appro- 
priate exception service routine. The 80376 is in a 
state that permits restart of the instruction. 


A Double fault (exception 8) results when the proc- 
essor attempts to invoke an exception service rou- 
tine for the segment exceptions (10, 11, 12 or 13), 
but in the process of doing so, detects an exception. 


When the processor is Reset the registers have the
values shown in Table 2.7. The 80376 will then start
executing instructions near the top of physical memory,
at location 0FFFFF0H. A short JMP should be
executed within the segment defined for power-up
(see Table 2.7). The GDT should then be initialized
for a start-up data and code segment followed by a
far JMP that will load the segment descriptor cache
with the new descriptor values. The IDT table, after
reset, is located at physical address 0H, with a limit
of 256 entries.


RESET forces the 80376 to terminate all execution 
and local bus activity. No instruction execution or 
bus activity will occur as long as Reset is active. 
Between 350 and 450 CLK2 periods after Reset be- 
comes inactive, the 80376 will start executing instructions
at the top of physical memory.


Consider the case of the 80376 having just completed an instruction. It then performs the following checks
before reaching the point where the next instruction is completed:

1. Check for Exception 1 Traps from the instruction just completed (single-step via Trap Flag, or Data
   Breakpoints set in the Debug Registers).
2. Check for external NMI and INTR.
3. Check for Exception 1 Faults in the next instruction (Instruction Execution Breakpoint set in the
   Debug Registers for the next instruction).
4. Check for Segmentation Faults that prevented fetching the entire next instruction (exceptions 11 or
   13).
5. Check for Faults decoding the next instruction (exception 6 if illegal opcode; or exception 13 if
   instruction is longer than 15 bytes, or privilege violation (i.e., not at IOPL or at CPL = 0)).
6. If WAIT opcode, check if TS = 1 and MP = 1 (exception 7 if both are 1).
7. If ESCape opcode for numeric coprocessor, check if EM = 1 or TS = 1 (exception 7 if either is 1).
8. If WAIT opcode or ESCape opcode for numeric coprocessor, check ERROR# input signal (exception
   16 if ERROR# input is asserted).
9. Check for Segmentation Faults that prevent transferring the entire memory quantity (exceptions 11,
   12, 13).
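Checks 6 and 7 in the list above reduce to a small truth table on the TS, MP and EM bits. The following Python sketch models just that decision; the function name and the string tags for the opcode class are hypothetical.

```python
def raises_device_not_available(opcode_class, ts, mp, em):
    """Return True if exception 7 would be reported for this opcode class.

    opcode_class is "WAIT", "ESC" (numeric coprocessor escape) or anything
    else; ts, mp and em are the bit values (0 or 1) named in the list.
    """
    if opcode_class == "WAIT":
        # Check 6: exception 7 only if both TS and MP are 1.
        return ts == 1 and mp == 1
    if opcode_class == "ESC":
        # Check 7: exception 7 if either EM or TS is 1.
        return em == 1 or ts == 1
    return False
```

Note that a WAIT with TS = 1 but MP = 0 passes check 6, whereas an ESC opcode with TS = 1 alone does not pass check 7.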
Flag Word (EFLAGS)           uuuu0002H                   (Note 1)
Machine Status Word (CR0)    uuuuuuu1H                   (Note 2)
Instruction Pointer (EIP)    0000FFF0H
Code Segment (CS)            F000H                       (Note 3)
Data Segment (DS)            0000H                       (Note 4)
Stack Segment (SS)           0000H
Extra Segment (ES)           0000H                       (Note 4)
Extra Segment (FS)           0000H
Extra Segment (GS)           0000H
EAX Register                 0000H                       (Note 5)
EDX Register                 Component and Stepping ID   (Note 6)
All Other Registers          Undefined                   (Note 7)

NOTES:
1. EFLAG Register. The upper 14 bits of the EFLAGS register are undefined; all defined flag bits are zero.
2. CR0: The defined 4 bits in CR0 are equal to 1H.
3. The Code Segment Register (CS) will have its Base Address set to 0FFFF0000H and Limit set to 0FFFFH.
4. The Data and Extra Segment Registers (DS and ES) will have their Base Address set to 000000000H and
   Limit set to 0FFFFH.
5. If self-test is selected, the EAX should contain a 0 value. If a value of 0 is not found the self-test
   has detected a flaw in the part.
6. EDX register always holds component and stepping identifier.
7. All unidentified bits are Intel Reserved and should not be used.


Because the 80376 processor starts executing in protected mode, certain precautions need to be taken during
initialization. Before any far jumps can take place the GDT and/or LDT tables need to be set up and their
respective registers loaded. Before interrupts can be initialized the IDT table must be set up and the IDTR must
be loaded. The example code is shown below:


; This is an example of startup code to put either an 80376,
; 80386SX or 80386 into flat mode. All of memory is treated as
; simple linear RAM. There are no interrupt routines. The
; Builder creates the GDT-alias and IDT-alias and places them,
; by default, in GDT[1] and GDT[2]. Other entries in the GDT
; are specified in the Build file. After initialization it jumps
; to a C startup routine. To use this template, change this jmp
; address to that of your code, or make the label of your code
; "c_startup".
;
; This code was assembled and built using version 1.2 of the
; Intel RLL utilities and Intel 386ASM assembler.

pe_flag         equ 1
data_selc       equ 20h         ; assume code is GDT[3], data GDT[4]

start:
        cld                     ; clear direction flag
        smsw bx                 ; check for processor (80376) at reset
        test bl,1               ; use SMSW rather than MOV for speed
        jnz pestart
realstart:                      ; is an 80386 and in real mode
        db 66h                  ; force the next operand into 32-bit mode.
        mov eax,offset gdt_desc ; move address of the GDT descriptor into eax
        xor ebx,ebx             ; clear ebx
        mov bh,ah               ; load 8 bits of address into bh
        mov bl,al               ; load 8 bits of address into bl
        db 67h                  ; use the 32-bit form of LGDT to load
        db 66h                  ; the 32-bits of address into the GDTR
        lgdt cs:[ebx]
        smsw ax                 ; go into protected mode (set PE bit)
        or al,pe_flag
        lmsw ax
        jmp next                ; the jump flushes the prefetch queue
pestart:
        mov ebx,offset gdt_desc
        xor eax,eax
        mov ax,bx               ; lower portion of address only
        lgdt cs:[eax]
        xor ebx,ebx             ; initialize data selectors
        mov bl,data_selc        ; selector 20h = GDT[4]
        mov ds,bx
        mov ss,bx
        mov es,bx
        mov fs,bx
        mov gs,bx
        jmp pejump
next:
        xor ebx,ebx             ; initialize data selectors
        mov bl,data_selc        ; selector 20h = GDT[4]
        mov ds,bx
        mov ss,bx
        mov es,bx
        mov fs,bx
        mov gs,bx
        db 66h                  ; 32-bit operand prefix, real-mode start only
pejump:
        jmp far ptr c_startup

        org 70h
        jmp short start
INIT_CODE ENDS
END

This code should be linked into your application for boot-loadable code. The following build file illustrates
how this is accomplished.


FLAT;                                   -- build program id

SEGMENT
    *segments (dpl=0),                  -- Give all user segments a DPL of 0.
    _phantom_code_ (dpl=0),             -- These two segments are created by
    _phantom_data_ (dpl=0),             -- the builder when the FLAT control is used.
    init_code (base=0ffffff80h);        -- Put startup code at the reset vector area.

GATE
    g13 (entry=13, dpl=0, trap),        -- trap gate leaves IF unchanged
    i32 (entry=32, dpl=0, interrupt);   -- interrupt gate disables interrupts

TABLE
    GDT (LOCATION = GDT_DESC,           -- In a buffer starting at GDT_DESC,
                                        -- BLD386 places the GDT base and
                                        -- GDT limit values. Buffer must be
                                        -- 6 bytes long. The base and limit
                                        -- values are placed in this buffer
                                        -- as two bytes of limit plus four
                                        -- bytes of base in the format
                                        -- required for use by the LGDT
                                        -- instruction.
         ENTRY = (3:_phantom_code_,     -- Explicitly place segment
                  4:_phantom_data_,     -- entries into the GDT.
                  5:code32,
                  6:data,
                  7:init_code)
        );

TASK
    MAIN_TASK
    (
        DPL = 0,                        -- Task privilege level is 0.
        DATA = DATA,                    -- Points to a segment that
                                        -- indicates initial DS value.
        CODE = main,                    -- Entry point is main, which
                                        -- must be a public id.
                                        -- Segment id points to stack
                                        -- segment. Sets the initial SS:ESP.
        NO INTENABLED,                  -- Disable interrupts.
        PRESENT                         -- Present bit in TSS set to 1.
    );

MEMORY
    (RANGE = (EPROM = ROM(0ffff8000h..0ffffffffh),
              DRAM = RAM(0..0ffffh)),
     ALLOCATE = (EPROM = (MAIN_TASK)));


asm386 flatsim.a38 debug
asm386 application.a38 debug
bnd386 application.obj,flatsim.obj nolo debug oj (application.bnd)
bld386 application.bnd bf (flatsim.bld) bl flat

Commands to assemble and build a boot-loadable application named "application.a38". The initialization code
is called "flatsim.a38", and build file is called "application.bld".




The 80376, like the 80386, has the capability to per- 
form a self-test. The self-test checks the function of 
all of the Control ROM and most of the non-random 
logic of the part. Approximately one-half of the 
80376 can be tested during self-test. 


Self-Test is initiated on the 80376 when the RESET
pin transitions from HIGH to LOW, and the BUSY#
pin is LOW. The self-test takes about 2^20 clocks, or
approximately 33 ms with a 16 MHz 80376 processor.
At the completion of self-test the processor performs
reset and begins normal operation. The part
has successfully passed self-test if the contents of
the EAX register are zero. If the EAX register is not
zero then the self-test has detected a flaw in the
part. If self-test is not selected after reset, EAX may
be non-zero after reset.


[Figure 2.6. Debug status and control registers. Field labels:
breakpoint 0-3 debug fault/trap; register access fault; single-step
debug trap; task switch debug trap; Gi: global breakpoint enable;
Li: local breakpoint enable; local exact breakpoint match; global
exact breakpoint match; global debug register access detect.]

The 80376 provides several features which simplify
the debugging process. The three categories of
on-chip debugging aids are:
1. The code execution breakpoint opcode (0CCH),
2. The single-step capability provided by the TF bit
   in the flag register, and
3. The code and data breakpoint capability provided
   by the Debug Registers DR0-3, DR6, and DR7.


A single-byte software interrupt (INT 3) breakpoint
instruction is available for use by software debuggers.
The breakpoint opcode is 0CCH, and generates an
exception 3 trap when executed.




If the single-step flag (TF, bit 8) in the EFLAG regis- 
ter is found to be set at the end of an instruction, a 
single-step exception occurs. The single-step ex- 
ception is auto vectored to exception number 1. 


The Debug Registers are an advanced debugging 
feature of the 80376. They allow data access break- 
points as well as code execution breakpoints. Since 
the breakpoints are indicated by on-chip registers, 
an instruction execution breakpoint can be placed in 
ROM code or in code shared by several tasks, nei- 
ther of which can be supported by the INT 3 break- 
point opcode. 


The 80376 contains six Debug Registers, consisting 
of four breakpoint address registers and two break- 
point control registers. Initially after reset, break- 
points are in the disabled state; therefore, no break- 
points will occur unless the debug registers are 
programmed. Breakpoints set up in the 
Debug 
Registers 
are 
auto-vectored 
to 
exception 
1. 
Figure 2.6 shows the breakpoint status and control 
registers. 




The Intel 80376 Embedded Processor has a physical
address space of 16 Mbytes (2^24 bytes) and allows
the running of virtual memory programs of almost
unlimited size (16K x 16 Mbytes or
256 Gbytes (2^38 bytes)). In addition the 80376 provides
sophisticated memory management and a
hardware-assisted protection mechanism.


3.1 Addressing Mechanism


The 80376 uses two components to form the logical
address: a 16-bit selector, which determines the linear
base address of a segment, and a 32-bit effective
address. The selector is used to specify an
index into an operating system defined table (see 
Figure 3.1). The table contains the 32-bit base ad- 
dress of a given segment. The linear address is 
formed by adding the base address obtained from 
the table to the 32-bit effective address. This value 
is truncated to 24 bits to form the physical address, 
which is then placed on the address bus. 
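The address formation described above can be sketched in Python as follows. The function names are hypothetical, and the wrap of the sum to 32 bits is an assumption made for the model; the truncation to 24 bits is the step the text describes.

```python
LINEAR_MASK = 0xFFFFFFFF    # 32-bit linear address (assumed wrap-around)
PHYSICAL_MASK = 0x00FFFFFF  # 24 address bits are placed on the bus


def linear_address(segment_base, effective_address):
    """Add the segment base from the descriptor to the effective address."""
    return (segment_base + effective_address) & LINEAR_MASK


def physical_address(linear):
    """Truncate a linear address to the 24-bit physical address."""
    return linear & PHYSICAL_MASK
```

Using the reset values listed in Table 2.7 (CS base 0FFFF0000H, EIP 0000FFF0H), the linear address is 0FFFFFFF0H and the truncated physical address is 0FFFFF0H, which matches the reset location given in the text.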




Segmentation is one method of memory manage- 
ment and provides the basis for protection in the 
80376. Segments are used to encapsulate regions 
of memory which have common attributes. For ex- 
ample, all of the code of a given program could be 
contained in a segment, or an operating system table
may reside in a segment. All information about
each segment is stored in an 8-byte data structure
called a descriptor. All of the descriptors in a system 
are contained in tables recognized by hardware. 


The following terms are used throughout the discussion
of descriptors, privilege levels and protection:

PL:   Privilege Level. One of the four hierarchical
      privilege levels. Level 0 is the most privileged
      level and level 3 is the least privileged.

RPL:  Requestor Privilege Level. The privilege
      level of the original supplier of the selector.
      RPL is determined by the two least significant
      bits of a selector.

DPL:  Descriptor Privilege Level. This is the least
      privileged level at which a task may access
      that descriptor (and the segment associated
      with that descriptor). Descriptor Privilege Level
      is determined by bits 6:5 in the Access
      Right Byte of a descriptor.

CPL:  Current Privilege Level. The privilege level
      at which a task is currently executing, which
      equals the privilege level of the code segment
      being executed. CPL can also be determined
      by examining the lowest 2 bits of the
      CS register, except for conforming code segments.

EPL:  Effective Privilege Level. The effective
      privilege level is the least privileged of the
      RPL and the DPL. EPL is the numerical maximum
      of RPL and DPL.

Task: One instance of the execution of a program.
      Tasks are also referred to as processes.
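Because level 0 is the most privileged and level 3 the least, "least privileged" corresponds to the numerically larger value. A minimal Python sketch of the EPL definition, with hypothetical function names:

```python
def is_valid_privilege_level(level):
    """Privilege levels are the integers 0 (most) through 3 (least)."""
    return level in (0, 1, 2, 3)


def effective_privilege_level(rpl, dpl):
    """EPL is the least privileged of RPL and DPL: the numerical maximum."""
    if not (is_valid_privilege_level(rpl) and is_valid_privilege_level(dpl)):
        raise ValueError("privilege levels must be in the range 0-3")
    return max(rpl, dpl)
```

For example, a selector supplied with RPL = 3 that refers to a descriptor with DPL = 0 has an EPL of 3.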


The descriptor tables define all of the segments
which are used in an 80376 system. There are three
types of tables on the 80376 which hold descriptors:
the Global Descriptor Table, Local Descriptor Table,
and the Interrupt Descriptor Table. All of the tables
are variable length memory arrays; they can range in
size between 8 bytes and 64 Kbytes. Each table can
hold up to 8192 8-byte descriptors. The upper 13
bits of a selector are used as an index into the
descriptor table. The tables have registers associated
with them which hold the 32-bit linear base address,
and the 16-bit limit of each table.


Each of the tables has a register associated with it:
GDTR, LDTR and IDTR; see Figure 3.2. The LGDT,
LLDT and LIDT instructions load the base and limit
of the Global, Local and Interrupt Descriptor Tables
into the appropriate register. The SGDT, SLDT and
SIDT store these base and limit values. These are
privileged instructions.


[Figure 3.2. Descriptor table registers. The LDT base linear address
and LDT limit are program invisible and automatically loaded from
the LDT descriptor.]

Global Descriptor Table


The Global Descriptor Table (GDT) contains de- 
scriptors which are possibly available to all of the 
tasks in a system. The GDT can contain any type of 
segment descriptor except for interrupt and trap de- 
scriptors. Every 80376 system contains a GDT. A 
simple 80376 system contains only 2 entries in the 
GDT; a code and a data descriptor. 


The first slot of the Global Descriptor Table corre- 
sponds to the null selector and is not used. The null 
selector defines a null pointer value. 


Local Descriptor Table


LDTs contain descriptors which are associated with 
a given task. Generally, operating systems are de- 
signed so that each task has a separate LDT. The 
LDT may contain only code, data, stack, task gate, 
and call gate descriptors. LDTs provide a mecha- 
nism for isolating a given task's code and data seg- 
ments from the rest of the operating system, while 
the GDT contains descriptors for segments which 
are common to all tasks. A segment cannot be ac- 
cessed by a task if its segment descriptor does not 
exist in either the current LDT or the GDT. This provides
both isolation and protection for a task's segments,
while still allowing global data to be shared
among tasks.


Unlike the 6-byte GDT or IDT registers which contain
a base address and limit, the visible portion of the
LDT register contains only a 16-bit selector. This
selector refers to a Local Descriptor Table descriptor in
the GDT (see Figure 2.1).


The third table needed for 80376 systems is the Interrupt
Descriptor Table. The IDT contains the descriptors
which point to the location of up to 256
interrupt service routines. The IDT may contain only
task gates, interrupt gates and trap gates. The IDT
should be at least 256 bytes in size in order to hold
the descriptors for the 32 Intel Reserved Interrupts.
Every interrupt used by a system must have an entry
in the IDT. The IDT entries are referenced by INT
instructions, external interrupt vectors, and exceptions.

The object to which the segment selector points
is called a descriptor. Descriptors are eight-byte
quantities which contain attributes about a given
region of linear address space. These attributes include
the 32-bit logical base address of the segment,
the 20-bit length and granularity of the segment,
the protection level, read, write or execute
privileges, and the type of segment. All of the attribute
information about a segment is contained in 12
bits in the segment descriptor. Figure 3.3 shows the
general format of a descriptor. All segments on
the 80376 have three attribute fields in common: the
Present bit (P), the Descriptor Privilege Level bits
(DPL) and the Segment bit (S). P = 1 if the segment
is loaded in physical memory; if P = 0 then any
attempt to access the segment causes a not present
exception (exception 11). The DPL is a two-bit field
which specifies the protection level, 0-3, associated
with a segment.

The 80376 has two main categories of segments:
system segments, and non-system segments (for
code and data). The segment bit, S, determines if a
given segment is a system segment, a code segment
or a data segment. If the S bit is 1 then the
segment is either a code or data segment; if it is 0
then the segment is a system segment.

Note that although the 80376 is limited to a
16-Mbyte physical address space (2^24 bytes), its base
address allows a segment to be placed anywhere in a
4-Gbyte linear address space. When writing code for
the 80376, users should keep code portability to an
80386 processor (or other processors with a larger
physical address space) in mind. A segment base
address can be placed anywhere in this 4-Gbyte
linear address space, but a physical address will be

(bit 31 at left, bit 0 at right)

Byte address 0:
    SEGMENT BASE 15...0 | SEGMENT LIMIT 15...0

Byte address +4:
    BASE 31...24 | G | 1 | 0 | AVL | LIMIT 19...16 | P | DPL | S | TYPE | A | BASE 23...16

BASE    Base Address of the segment
LIMIT   The length of the segment
P       Present Bit: 1 = Present, 0 = Not Present
DPL     Descriptor Privilege Level 0-3
S       Segment Descriptor: 0 = System Descriptor,
        1 = Code or Data Descriptor
TYPE    Type of Segment
A       Accessed Bit
G       Granularity Bit: 1 = Segment length is 4 Kbyte granular,
        0 = Segment length is byte granular
0       Bit must be zero (0) for compatibility with future processors
AVL     Available field for user or OS


Figure 3.3. Segment Descriptors
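The field layout in Figure 3.3 can be exercised with a small decoder. This Python sketch treats a descriptor as two 32-bit words (the word at byte address 0 and the word at byte address +4); the function name and the `last_offset` field are hypothetical, and the 4-Kbyte scaling of the limit assumes an expand-up segment.

```python
def decode_descriptor(low, high):
    """Decode the two 32-bit words of an 8-byte segment descriptor."""
    base = ((low >> 16) & 0xFFFF) | ((high & 0xFF) << 16) | (high & 0xFF000000)
    limit = (low & 0xFFFF) | (high & 0x000F0000)   # 20-bit limit field
    granular = (high >> 23) & 1                    # G bit
    access_rights = (high >> 8) & 0xFF             # P, DPL, S, TYPE, A
    if granular:
        # 4-Kbyte granular: the low 12 bits of the last offset are all ones.
        last_offset = (limit << 12) | 0xFFF
    else:
        last_offset = limit
    return {
        "base": base,
        "limit": limit,
        "G": granular,
        "access_rights": access_rights,
        "last_offset": last_offset,
    }
```

For instance, the words 0000FFFFH and 0FF409AFFH decode to a byte-granular segment with base 0FFFF0000H and limit 0FFFFH, the same base and limit that Table 2.7 lists for CS after reset.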


Byte address 0:
    SEGMENT BASE 15...0 | SEGMENT LIMIT 15...0

Byte address +4:
    BASE 31...24 | G | 1 | 0 | AVL | LIMIT 19...16 | ACCESS RIGHTS BYTE | BASE 23...16

G       Granularity Bit: 1 = Segment length is 4 Kbyte granular,
        0 = Segment length is byte granular
0       Bit must be zero (0) for compatibility with future processors
AVL     Available field for user or OS


Figure 3.4. Code and Data Descriptors
Bit
Position  Name                        Function

7         Present (P)                 P = 1   Segment is mapped into physical memory.
                                      P = 0   No mapping to physical memory exists.
6-5       Descriptor Privilege        Segment privilege attribute used in privilege tests.
          Level (DPL)
4         Segment Descriptor (S)      S = 1   Code or Data (includes stacks) segment descriptor.
                                      S = 0   System Segment Descriptor or Gate Descriptor.

If Data Segment (S = 1, E = 0):
3         Executable (E)              E = 0   Descriptor type is data segment.
2         Expansion Direction (ED)    ED = 0  Expand up segment, offsets must be <= limit.
                                      ED = 1  Expand down segment, offsets must be > limit.
1         Writable (W)                W = 0   Data segment may not be written into.
                                      W = 1   Data segment may be written into.

If Code Segment (S = 1, E = 1):
3         Executable (E)              E = 1   Descriptor type is code segment.
2         Conforming (C)              C = 1   Code segment may only be executed when
                                              CPL >= DPL and CPL remains unchanged.
1         Readable (R)                R = 0   Code segment may not be read.
                                      R = 1   Code segment may be read.

0         Accessed (A)                A = 0   Segment has not been accessed.
                                      A = 1   Segment selector has been loaded into segment register
                                              or used by selector test instructions.
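The bit assignments in the table above can be sketched as a Python decoder for the Access Rights Byte. The function name is hypothetical; for system descriptors (S = 0) the sketch simply returns the low four bits as a TYPE value rather than interpreting them.

```python
def decode_access_rights(ar):
    """Split an Access Rights Byte into the fields named in the table."""
    fields = {
        "P": (ar >> 7) & 1,
        "DPL": (ar >> 5) & 3,
        "S": (ar >> 4) & 1,
    }
    if fields["S"] == 0:
        # System segment or gate descriptor: bits 3..0 form the type.
        fields["TYPE"] = ar & 0x0F
        return fields
    fields["E"] = (ar >> 3) & 1
    if fields["E"]:
        fields["C"] = (ar >> 2) & 1    # conforming code segment
        fields["R"] = (ar >> 1) & 1    # readable code segment
    else:
        fields["ED"] = (ar >> 2) & 1   # expansion direction
        fields["W"] = (ar >> 1) & 1    # writable data segment
    fields["A"] = ar & 1
    return fields
```

A byte of 9AH decodes as a present, DPL 0, readable, non-conforming code segment, and 92H as a present, DPL 0, writable, expand-up data segment.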


generated that is a truncated version of this linear 
address. Truncation will be to the maximum number 
of address bits. It is recommended to place EPROM 
at the highest physical address and DRAM at the 
lowest physical addresses. 


Code and Data Descriptors (S = 1)


Figure 3.4 shows the general format of a code and 
data descriptor and Table 3.1 illustrates how the bits 
in the Access Right Byte are interpreted. 


Code and data segments have several descriptor 
fields in common. The accessed bit, A, is set when- 
ever the processor accesses a descriptor. The gran- 
ularity bit, G, specifies if a segment length is 1-byte- 
granular or 4-Kbyte-granular. Base address bits 
31-24, which are normally found in 80386 descrip- 
tors, are not made externally available on the 80376. 
They do not affect the operation of the 80376. The 
A31-A24 field should be set to allow an 80386 to 
correctly execute with EPROM at the upper 4096 
Mbytes of physical memory. 


System Descriptor Formats (S = 0)


System segments describe information about operating
system tables, tasks, and gates. Figure 3.5
shows the general format of system segment de- 
scriptors, and the various types of system segments. 


80376 system descriptors (which are the same as 
80386 descriptor types 2, 5, 9, B, C, E and F) contain 
a 32-bit logical base address and a 20-bit segment 
limit. 


Selector Fields


A selector has three fields: Local or Global Descriptor
Table Indicator (TI), Descriptor Entry Index (Index),
and Requestor (the selector's) Privilege Level
(RPL) as shown in Figure 3.6. The TI bit selects either
the Global Descriptor Table or the Local Descriptor
Table. The Index selects one of 8K descriptors
in the appropriate descriptor table. The RPL bits
allow high speed testing of the selector's privilege
attributes.
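The three selector fields can be separated with shifts and masks. This Python sketch uses hypothetical names and assumes the conventional encoding in which TI = 0 selects the GDT and TI = 1 selects the LDT.

```python
DESCRIPTOR_SIZE = 8  # descriptors are 8-byte quantities


def decode_selector(selector):
    """Split a 16-bit selector into its Index, TI and RPL fields."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,        # upper 13 bits
        "ti": (selector >> 2) & 1,     # 0 = GDT, 1 = LDT (assumed encoding)
        "rpl": selector & 0x3,         # two least significant bits
    }


def descriptor_offset(selector):
    """Byte offset of the selected descriptor within its table."""
    return decode_selector(selector)["index"] * DESCRIPTOR_SIZE
```

The selector 20H used as `data_selc` in the startup listing decodes to index 4 in the GDT with RPL 0, that is, the descriptor at byte offset 32.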


Segment Descriptor Cache


In addition to the selector value, every segment register
has a segment descriptor cache register associated
with it. Whenever a segment register's contents
are changed, the 8-byte descriptor associated
with that selector is automatically loaded (cached)
on the chip. Once loaded, all references to that segment
use the cached descriptor information instead
of reaccessing the descriptor. The contents of the
descriptor cache are not visible to the programmer.
Since descriptor caches only change when a segment
register is changed, programs which modify
the descriptor tables must reload the appropriate
segment registers after changing a descriptor's
value.
Figure 3.6. Example Descriptor Selection


The 80376 offers extensive protection features. These
protection features are particularly useful in
sophisticated embedded applications which use
multitasking real-time operating systems. For simpler
embedded applications these protection capabilities can
be easily bypassed by making all applications run at
privilege level (PL) 0.


The 80376 controls access to both data and procedures
between levels of a task, according to the following
rules.


- Data stored in a segment with privilege level p can be
  accessed only by code executing at a privilege level at
  least as privileged as p.


- A code segment/procedure with privilege level p can
  only be called by a task executing at the same or a
  lesser privilege level than p.


At any point in time, a task on the 80376 always executes
at one of the four privilege levels. The Current
Privilege Level (CPL) specifies what the task's privilege
level is. A task's CPL may only be changed by control
transfers through gate descriptors to a code segment with
a different privilege level. Thus, an application program
running at PL = 3 may call an operating system routine at
PL = 1 (via a gate) which would cause the task's CPL to
be set to 1 until the operating system routine was
finished.


Selector Privilege (RPL)


The privilege level of a selector is specified by the 
RPL field. The selector's RPL is only used to estab- 
lish a less trusted privilege level than the current 
privilege level of the task for the use of a segment. 
This level is called the task's effective privilege level 
(EPL). The EPL is defined as being the least privi- 
leged (numerically larger) level of a task's CPL and a 
selector's RPL. The RPL is most commonly used to 
verify that pointers passed to an operating system 
procedure do not access data that is of higher privi- 
lege than the procedure that originated the pointer. 
Since the originator of a selector can specify any
RPL value, the Adjust RPL (ARPL) instruction is provided
to force the RPL bits to the originator's CPL.
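
The EPL rule and the effect of ARPL can be modelled in a few lines. This is a rough sketch rather than a description of the hardware: the function names are hypothetical, the RPL is assumed to sit in the low two bits of the selector, and adjust_rpl only models the adjustment of the RPL bits (raising the RPL when it is more privileged than the originator's CPL), not the flag results of the real instruction.

```c
#include <assert.h>
#include <stdint.h>

/* EPL = the least privileged (numerically larger) of CPL and RPL. */
static unsigned effective_privilege_level(unsigned cpl, uint16_t selector)
{
    unsigned rpl = selector & 3u;   /* assumed: RPL in bits 1..0 */
    assert(cpl < 4u);
    return cpl > rpl ? cpl : rpl;
}

/* Model of the ARPL adjustment: make sure the selector's RPL is not
 * more privileged than the CPL of the code that supplied it. */
static uint16_t adjust_rpl(uint16_t selector, unsigned originator_cpl)
{
    assert(originator_cpl < 4u);
    if ((selector & 3u) < originator_cpl)
        selector = (uint16_t)((selector & ~3u) | originator_cpl);
    return selector;
}
```

For example, an operating system procedure running at CPL = 0 that receives selector 0010H from a PL = 3 caller would adjust it to 0013H, so that later accesses through it are checked at EPL = 3.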


I/O Privilege


The I/O privilege level (IOPL) lets the operating system
code executing at CPL = 0 define the least privileged
level at which I/O instructions can be used. An exception
13 (General Protection Violation) is generated if an I/O
instruction is attempted when the CPL of the task is less
privileged than the IOPL. The IOPL is stored in bits 12
and 13 of the EFLAGS register. The following instructions
cause an exception 13 if the CPL is greater than IOPL:
IN, INS, OUT, OUTS, STI, CLI and LOCK prefix.


Descriptor Access


There are basically two types of segment accesses: those
involving code segments such as control transfers, and
those involving data accesses. Determining the ability of
a task to access a segment involves the type of segment
to be accessed, the instruction used, the type of
descriptor used and CPL, RPL, and DPL as described above.


Any time an instruction loads a data segment register
(DS, ES, FS, GS) the 80376 makes protection validation
checks. Selectors loaded in the DS, ES, FS, GS registers
must refer only to data segments or readable code
segments.


Finally the privilege validation checks are performed. 
The CPL is compared to the EPL and if the EPL is 
more privileged than the CPL, an exception 13 (gen- 
eral protection fault) is generated. 


The rules regarding the stack segment are slightly
different than those involving data segments.
Instructions that load selectors into SS must refer to
data segment descriptors for writeable data segments.
The DPL and RPL must equal the CPL. All other descriptor
types, or a privilege level violation, will cause an
exception 13. A stack not present fault causes an
exception 12.


Inter-segment control transfers occur when a selec- 
tor is loaded in the CS register. For a typical system 
most of these transfers are simply the result of a call 
or a jump to another routine. There are five types of 
control transfers which are summarized in Table 3.2. 
Many of these transfers result in a privilege level 
transfer. Changing privilege levels is done only by 
control transfers, using gates, task switches, and in- 
terrupt or trap gates. 


Control transfers can only occur if the operation 
which loaded the selector references the correct de- 
scriptor type. Any violation of these descriptor usage 
rules will cause an exception 13. 


Gates provide protected indirect CALLs. One of the major
uses of gates is to provide a secure method of privilege
transfers within a task. Since the operating system
defines all of the gates in a system, it can ensure that
all gates only allow entry into a few trusted procedures.


Control Transfer Types       Operation Types            Descriptor           Descriptor
                                                        Referenced           Table

Intersegment within the      JMP, CALL, RET, IRET       Code Segment         GDT/LDT
same privilege level

Intersegment to the same     CALL                       Call Gate            GDT/LDT
or higher privilege level
                             Interrupt Instruction,     Trap or Interrupt    IDT
Interrupt within task        Exception, External        Gate
may change CPL               Interrupt

Intersegment to a lower      RET, IRET                  Code Segment         GDT/LDT
privilege level
(changes task CPL)

Task Switch                  CALL, JMP                  Task State Segment   GDT

                             CALL, JMP                  Task Gate            GDT/LDT

                             IRET, Interrupt            Task Gate            IDT
                             Instruction, Exception,
                             External Interrupt
A very important attribute of any multi-tasking operating
system is its ability to rapidly switch between tasks or
processes. The 80376 directly supports this operation by
providing a task switch instruction in hardware. The
80376 task switch operation saves the entire state of the
machine (all of the registers, address space, and a link
to the previous task), loads a new execution state,
performs protection checks, and commences execution in
the new task. Like transfer of control by gates, the task
switch operation is invoked by executing an inter-segment
JMP or CALL instruction which refers to a Task State
Segment (TSS), or a task gate descriptor in the GDT or
LDT. An INT n instruction, exception, trap or external
interrupt may also invoke the task switch operation if
there is a task gate descriptor in the associated IDT
descriptor slot. For simple applications, the TSS and
task switching may not be used. The TSS or task switch
will not be used or occur if no task gates are present in
the GDT, LDT or IDT.


The TSS descriptor points to a segment (see Figure 
3.7) containing the entire 80376 execution state. A 
task gate descriptor contains a TSS selector. The 
limit of an 80376 TSS must be greater than 64H, and 
can be as large as 16 Mbytes. In the additional TSS
space, the operating system is free to store additional
information such as the reason the task is inactive, the
time the task has spent running, and open files
belonging to the task.


Each Task must have a TSS associated with it. The 
current TSS is identified by a special register in the 
80376 called the Task State Segment Register (TR). 
This register contains a selector referring to the task 
state segment descriptor that defines the current 
TSS. A hidden base and limit register associated 
with the TSS descriptor is loaded whenever TR is 
loaded with a new selector. Returning from a task is
accomplished by the IRET instruction. When IRET is
executed, control is returned to the task which was
interrupted. The current executing task's state is
saved in the TSS and the old task state is restored
from its TSS.


Several bits in the flag register and CR0 register give
information about the state of a task which is useful
to the operating system. The Nested Task bit, NT,
controls the function of the IRET instruction. If NT = 0
the IRET instruction performs the regular return. If
NT = 1, IRET performs a task switch operation
back to the previous task. The NT bit is set or reset
in the following fashion:


When a CALL or INT instruction initiates a task 
switch, the new TSS will be marked busy and 
the back link field of the new TSS set to the old 
TSS selector. The NT bit of the new task is set 
by CALL or INT initiated task switches. An inter- 
rupt that does not cause a task switch will clear 
NT (The NT bit will be restored after execution 
of the interrupt handler). NT may also be set or 
cleared by POPF or IRET instructions. 


The 80376 task state segment is marked busy by
changing the descriptor type field from TYPE 9 to
TYPE 0BH. Use of a selector that references a busy
task state segment causes an exception 13.
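
Types 9 and 0BH differ only in bit 1 of the type field, so marking a TSS busy amounts to setting that one bit. The fragment below is a minimal sketch of that observation; the constant and function names are hypothetical, and tss_use_exception models only the rule stated above.

```c
#include <assert.h>
#include <stdint.h>

enum {
    TSS_TYPE_NOT_BUSY = 0x9, /* TYPE 9: TSS descriptor, not busy */
    TSS_TYPE_BUSY     = 0xB  /* TYPE 0BH: the same descriptor marked busy */
};

/* Mark a TSS descriptor type busy by setting bit 1 (9 -> 0BH). */
static uint8_t tss_mark_busy(uint8_t type)
{
    assert(type == TSS_TYPE_NOT_BUSY);
    return (uint8_t)(type | 0x2u);
}

/* Exception number raised when a selector references this TSS type:
 * 13 for a busy TSS, 0 (none) otherwise. */
static int tss_use_exception(uint8_t type)
{
    return type == TSS_TYPE_BUSY ? 13 : 0;
}
```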


The coprocessor's state is not automatically saved
when a task switch occurs. The Task Switched Bit,
TS, in the CR0 register helps deal with the
coprocessor's state in a multi-tasking environment.
Whenever the 80376 switches tasks, it sets the TS bit.
The 80376 detects the first use of a processor extension
instruction after a task switch and causes the
processor extension not available exception 7. The
exception handler for exception 7 may then decide
whether to save the state of the coprocessor.


The T bit in the 80376 TSS indicates that the proc- 
essor should generate a debug exception when 
switching to a task. If T = 1 then upon entry to a 
new task a debug exception 1 will be generated. 
The I/O instructions that directly refer to addresses in
the processor's I/O space are IN, INS, OUT and OUTS. The
80376 has the ability to selectively trap references to
specific I/O addresses. The structure that enables
selective trapping is the I/O Permission Bit Map in the
TSS segment (see Figures 3.7 and 3.8). The I/O permission
map is a bit vector. The size of the map and its location
in the TSS segment are variable. The processor locates
the I/O permission map by means of the I/O map base field
in the fixed portion of the TSS. The I/O map base field
is 16 bits wide and contains the offset of the beginning
of the I/O permission map.


If an I/O instruction (IN, INS, OUT or OUTS) is
encountered, the processor first checks whether
CPL ≤ IOPL. If this condition is true, the I/O operation
may proceed. If not true, the processor checks the I/O
permission map.


Each bit in the map corresponds to an I/O port byte
address; for example, the bit for port 41 is found at
I/O map base + 5 linearly (5 x 8 = 40), bit offset 1.
The processor tests all the bits that correspond to the
I/O addresses spanned by an I/O operation; for example,
a double word operation tests four bits corresponding to
four adjacent byte addresses. If any tested bit is set,
the processor signals a general protection exception. If
all the tested bits are zero, the I/O operations may
proceed.
It is not necessary for the I/O permission map to
represent all the I/O addresses. I/O addresses not
spanned by the map are treated as if they had one-bits
in the map. The I/O map base should be at least one byte
less than the TSS limit and the last byte beyond the I/O
mapping information must contain all 1's.


Because the I/O permission map is in the TSS segment,
different tasks can have different maps. Thus, the
operating system can allocate ports to a task by changing
the I/O permission map in the task's TSS.


IMPORTANT IMPLEMENTATION NOTE: Beyond the last byte of
I/O mapping information in the I/O permission bit map
must be a byte containing all 1's. The byte of all 1's
must be within the limit of the 80376's TSS segment (see
Figure 3.7).
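
The permission check described above can be sketched as a small function. It is an illustration under stated assumptions rather than a statement of the processor's internal algorithm: io_access_permitted and its parameters are hypothetical, map is assumed to point at the first byte of the I/O permission map (TSS base plus the I/O map base field), and map_len is assumed to be the number of map bytes that lie within the TSS limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the I/O access may proceed, 0 if it would cause a
 * general protection exception. width is the operand size in bytes. */
static int io_access_permitted(unsigned cpl, unsigned iopl,
                               const uint8_t *map, size_t map_len,
                               uint16_t port, unsigned width)
{
    unsigned i;

    assert(width == 1u || width == 2u || width == 4u);
    if (cpl <= iopl)
        return 1;                      /* CPL <= IOPL: map is not consulted */

    for (i = 0; i < width; i++) {
        uint32_t addr = (uint32_t)port + i; /* one bit per byte-wide port */
        size_t   byte = addr / 8u;
        unsigned bit  = addr % 8u;

        if (byte >= map_len)
            return 0;                  /* ports beyond the map act as one-bits */
        if (map[byte] & (1u << bit))
            return 0;                  /* any set bit denies the access */
    }
    return 1;
}
```

With a seven-byte map whose byte 5 is 02H and whose final byte is FFH, a byte access to port 41 is denied (byte 5, bit 1 is set), a byte access to port 40 is allowed, and a word access to port 40 is denied because it also spans port 41.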


The Intel 80376 embedded processor features a
straightforward functional interface to the external
hardware. The 80376 has separate parallel buses for data
and address. The data bus is 16 bits in width, and
bidirectional. The address bus outputs 24-bit address
values using 23 address lines and two byte enable
signals.

The 80376 has two selectable address bus cycles:
pipelined and non-pipelined. The pipelining option allows
as much time as possible for data access by starting the
pending bus cycle before the present bus cycle is
finished. A non-pipelined bus cycle gives the highest bus
performance by executing every bus cycle in two processor
clock cycles. For maximum design flexibility, the address
pipelining option is selectable on a cycle-by-cycle
basis.


The processor's bus cycle is the basic mechanism 
for information transfer, either from system to proc- 
essor, or from processor to system. 80376 bus cy- 
cles perform data transfer in a minimum of only two 
clock periods. On a 16-bit data bus, the maximum 
80376 transfer bandwidth at 16 MHz is therefore 
16 Mbytes/sec. However, any bus cycle will be ex- 
tended for more than two clock periods if external 
hardware withholds acknowledgement of the cycle. 
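
The bandwidth figure follows from simple arithmetic: two bytes per bus cycle, two clock periods per cycle, 16 million clock periods per second. The helper below is a sketch of that calculation; the function name is hypothetical, and the assumption that each wait state adds exactly one clock period to the bus cycle is ours.

```c
#include <assert.h>

/* Peak transfer rate in bytes per second (integer arithmetic).
 * clock_hz:    processor clock frequency, e.g. 16000000 for 16 MHz
 * bus_bytes:   data bus width in bytes (2 for a 16-bit bus)
 * wait_states: extra clock periods per bus cycle (assumed model) */
static unsigned long bus_bandwidth(unsigned long clock_hz,
                                   unsigned bus_bytes,
                                   unsigned wait_states)
{
    unsigned clocks_per_cycle = 2u + wait_states; /* minimum is two clocks */

    assert(bus_bytes > 0u);
    return clock_hz / clocks_per_cycle * bus_bytes;
}
```

Under this model, bus_bandwidth(16000000UL, 2, 0) gives 16,000,000 bytes/sec, and two wait states on every cycle would halve that figure.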


The 80376 can relinquish control of its local buses to
allow mastership by other devices, such as direct memory
access (DMA) channels. When relinquished, HLDA is the
only output pin driven by the 80376, providing
near-complete isolation of the processor from its system
(all other output pins are in a float condition).


4.1 Signal Description Overview


Ahead is a brief description of the 80376 input and 
output signals arranged by functional groups. Note 
the # symbol at the end of a signal name indicates 
the active, or asserted, state occurs when the signal 
is at a LOW voltage. When no # is present after the 
signal name, the signal is asserted when at the 
HIGH voltage level. 


Example signal: M/IO# - HIGH voltage indicates Memory selected
                      - LOW voltage indicates I/O selected


The signal descriptions sometimes refer to A.C. tim- 
ing parameters, such as "t25 Reset Setup Time" and 
"t26 Reset Hold Time." The values of these parame- 
ters can be found in Table 6.4. 


CLK2 provides the fundamental timing for the 80376. It is
divided by two internally to generate the internal
processor clock used for instruction execution. The
internal clock is comprised of two phases, "phase one"
and "phase two". Each CLK2 period is a phase of the
internal clock. Figure 4.2 illustrates the relationship.
If desired, the phase of the internal processor clock can
be synchronized to a known phase by ensuring the falling
edge of the RESET signal meets the applicable setup and
hold times t25 and t26.


[Figure: CLK2 and INTERNAL PROCESSOR CLOCK waveforms]


These three-state bidirectional signals provide the 
general purpose data path between the 80376 and 
other devices. The data bus outputs are active HIGH 
and will float during bus hold acknowledge. Data bus 
reads require that read-data setup and hold times 
t21 and t22 be met relative to CLK2 for correct oper- 
ation. 


These three-state outputs provide physical memory
addresses or I/O port addresses. A23-A16 are LOW during
I/O transfers except for I/O transfers automatically
generated by coprocessor instructions.


During coprocessor I/O transfers, A22-A16 are driven LOW,
and A23 is driven HIGH so that this address line can be
used by external logic to generate the coprocessor select
signal. Thus, the I/O address driven by the 80376 for
coprocessor commands is 8000F8H, and the I/O address
driven by the 80376 processor for coprocessor data is
8000FCH or 8000FEH.


The address bus is capable of addressing 16 Mbytes of
physical memory space (000000H through 0FFFFFFH), and
64 Kbytes of I/O address space (000000H through 00FFFFH)
for programmed I/O.


The address bus is active HIGH and will float during
bus hold acknowledge.


The Byte Enable outputs BHE# and BLE# directly indicate
which bytes of the 16-bit data bus are involved with the
current transfer. BHE# applies to D15-D8 and BLE# applies
to D7-D0. If both BHE# and BLE# are asserted, then 16
bits of data are being transferred. See Table 4.1 for a
complete decoding of these signals. The byte enables are
active LOW and will float during bus hold acknowledge.


Table 4.1. Byte Enable Definitions

BHE#   BLE#   Function

0      0      Word Transfer
0      1      Byte Transfer on Upper Byte of the Data Bus, D15-D8
1      0      Byte Transfer on Lower Byte of the Data Bus, D7-D0
1      1      Never Occurs
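
External decode logic or a bus-monitor program can apply Table 4.1 directly. The following sketch treats each argument as the electrical level on the pin (0 = LOW = asserted); the enumeration and function names are hypothetical.

```c
#include <assert.h>

typedef enum {
    XFER_WORD,        /* BHE# = 0, BLE# = 0: 16-bit transfer */
    XFER_UPPER_BYTE,  /* BHE# = 0, BLE# = 1: upper byte of the data bus */
    XFER_LOWER_BYTE,  /* BHE# = 1, BLE# = 0: lower byte of the data bus */
    XFER_NEVER_OCCURS /* BHE# = 1, BLE# = 1 */
} transfer_kind;

/* bhe_n and ble_n are pin levels; both signals are active LOW. */
static transfer_kind decode_byte_enables(int bhe_n, int ble_n)
{
    assert((bhe_n == 0 || bhe_n == 1) && (ble_n == 0 || ble_n == 1));
    if (!bhe_n && !ble_n)
        return XFER_WORD;
    if (!bhe_n)
        return XFER_UPPER_BYTE;
    if (!ble_n)
        return XFER_LOWER_BYTE;
    return XFER_NEVER_OCCURS;
}
```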


BUS CYCLE DEFINITION SIGNALS (W/R#, D/C#, M/IO#, LOCK#)


These three-state outputs define the type of bus cycle
being performed: W/R# distinguishes between write and
read cycles, D/C# distinguishes between data and control
cycles, M/IO# distinguishes between memory and I/O
cycles, and LOCK# distinguishes between locked and
unlocked bus cycles. All of these signals are active LOW
and will float during bus acknowledge.


The primary bus cycle definition signals are W/R#, D/C#
and M/IO#, since these are the signals driven valid as
ADS# (Address Status output) becomes active. The LOCK#
signal is driven valid at the same time the bus cycle
begins, which due to address pipelining, could be after
ADS# becomes active. Exact bus cycle definitions, as a
function of W/R#, D/C# and M/IO# are given in Table 4.2.


LOCK# indicates that other system bus masters are not to
gain control of the system bus while it is active. LOCK#
is activated on the CLK2 edge that begins the first
locked bus cycle (i.e., it is not active at the same time
as the other bus cycle definition pins) and is
deactivated when ready is returned at the end of the last
bus cycle which is to be locked. The beginning of a bus
cycle is determined when READY# is returned in a previous
bus cycle and another is pending (ADS# is active) or the
clock in which ADS# is driven active if the bus was idle.
This means that it follows more closely with the write
data rules when it is valid, but may cause the bus to be
locked longer than desired. The LOCK# signal may be
explicitly activated by the LOCK prefix on certain
instructions. LOCK# is always asserted when executing the
XCHG instruction, during descriptor updates, and during
the interrupt acknowledge sequence.


BUS CONTROL SIGNALS 
(ADS#, READY#, NA#) 


The following signals allow the processor to indicate 
when a bus cycle has begun, and allow other system 
hardware to control address pipelining and bus cycle 
termination. 


Address Status (ADS#)


This three-state output indicates that a valid bus cycle
definition and address (W/R#, D/C#, M/IO#, BHE#, BLE# and
A23-A1) are being driven at the 80376 pins. ADS# is an
active LOW output. Once ADS# is driven active, valid
address, byte enables, and definition signals will not
change. In addition, ADS# will remain active until its
associated bus cycle begins (when READY# is returned for
the previous bus cycle when running pipelined bus
cycles). ADS# will float during bus hold acknowledge. See
sections Non-Pipelined Bus Cycles (page 43) and Pipelined
Bus Cycles (page 45) for additional information on how
ADS# is asserted for different bus states.


Transfer Acknowledge (READY#)


This input indicates the current bus cycle is com- 
plete, and the active bytes indicated by BHE# and 
BLE# are accepted or provided. When READY# is 
sampled active during a read cycle or interrupt ac- 
knowledge cycle, the 80376 latches the input data 
and terminates the cycle. When READY# is sam- 
pled active during a write cycle, the processor termi- 
nates the bus cycle. 


M/IO#   D/C#   W/R#   Bus Cycle Type                               Locked?

0       0      0      INTERRUPT ACKNOWLEDGE                        Yes
0       0      1      Does Not Occur                               -
0       1      0      I/O DATA READ                                No
0       1      1      I/O DATA WRITE                               No
1       0      0      MEMORY CODE READ                             No
1       0      1      HALT:     Address = 2, BHE# = 1, BLE# = 0    No
                      SHUTDOWN: Address = 0, BHE# = 1, BLE# = 0
1       1      0      MEMORY DATA READ                             Some Cycles
1       1      1      MEMORY DATA WRITE                            Some Cycles
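
The bus cycle definitions above reduce to a three-bit lookup. The sketch below shows one way a simulator or logic-analyzer script might classify a cycle from the sampled pin levels; all identifiers are hypothetical, and the address test for telling HALT from SHUTDOWN follows the Address = 2 / Address = 0 entries of the table.

```c
#include <assert.h>

typedef enum {
    CYCLE_INTERRUPT_ACKNOWLEDGE,
    CYCLE_DOES_NOT_OCCUR,
    CYCLE_IO_DATA_READ,
    CYCLE_IO_DATA_WRITE,
    CYCLE_MEMORY_CODE_READ,
    CYCLE_HALT_OR_SHUTDOWN,
    CYCLE_MEMORY_DATA_READ,
    CYCLE_MEMORY_DATA_WRITE
} bus_cycle_type;

/* Arguments are the levels (0 or 1) sampled on M/IO#, D/C# and W/R#. */
static bus_cycle_type decode_bus_cycle(int m_io, int d_c, int w_r)
{
    static const bus_cycle_type table[8] = {
        CYCLE_INTERRUPT_ACKNOWLEDGE, /* 0 0 0 */
        CYCLE_DOES_NOT_OCCUR,        /* 0 0 1 */
        CYCLE_IO_DATA_READ,          /* 0 1 0 */
        CYCLE_IO_DATA_WRITE,         /* 0 1 1 */
        CYCLE_MEMORY_CODE_READ,      /* 1 0 0 */
        CYCLE_HALT_OR_SHUTDOWN,      /* 1 0 1 */
        CYCLE_MEMORY_DATA_READ,      /* 1 1 0 */
        CYCLE_MEMORY_DATA_WRITE      /* 1 1 1 */
    };

    assert((m_io | d_c | w_r) == 0 || (m_io | d_c | w_r) == 1);
    return table[(m_io << 2) | (d_c << 1) | w_r];
}

/* For a HALT/SHUTDOWN indication cycle: address 2 means HALT,
 * address 0 means SHUTDOWN. Returns 1 for HALT, 0 for SHUTDOWN. */
static int indication_is_halt(unsigned long address)
{
    assert(address == 0ul || address == 2ul);
    return address == 2ul;
}
```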
READY# is ignored on the first bus state of all bus
cycles, and sampled each bus state thereafter until 
asserted. READY# must eventually be asserted to 
acknowledge every bus cycle, including Halt Indica- 
tion and Shutdown Indication bus cycles. When be- 
ing sampled, READY# must always meet setup and 
hold times t19 and t20 for correct operation. 


Next Address Request (NA#)


This is used to request pipelining. This input indicates
the system is prepared to accept new values of BHE#,
BLE#, A23-A1, W/R#, D/C# and M/IO# from the 80376 even if
the end of the current cycle is not being acknowledged on
READY#. If this input is active when sampled, the next
bus cycle's address and status signals are driven onto
the bus, provided the next bus request is already pending
internally. NA# is ignored in clock cycles in which ADS#
or READY# is activated. This signal is active LOW and
must satisfy setup and hold times t15 and t16 for correct
operation. See Pipelined Bus Cycles (page 45) and Read
and Write Cycles (page 42) for additional information.


BUS ARBITRATION SIGNALS (HOLD, HLDA)


This section describes the mechanism by which the
processor relinquishes control of its local buses when
requested by another bus master device. See Entering and
Exiting Hold Acknowledge (page 52) for additional
information.


Bus Hold Request (HOLD)


This input indicates some device other than the 80376
requires bus mastership. When control is granted, the
80376 floats A23-A1, BHE#, BLE#, D15-D0, LOCK#, M/IO#,
D/C#, W/R# and ADS#, and then activates HLDA, thus
entering the bus hold acknowledge state. The local bus
will remain granted to the requesting master until HOLD
becomes inactive. When HOLD becomes inactive, the 80376
will deactivate HLDA and drive the local bus (at the same
time), thus terminating the hold acknowledge condition.


HOLD must remain asserted as long as any other device is
a local bus master. External pull-up resistors may be
required when in the hold acknowledge state since none of
the 80376 floated outputs have internal pull-up
resistors. See Resistor Recommendations (page 59) for
additional information. HOLD is not recognized while
RESET is active but is recognized during the time between
the high-to-low transition of RESET and the first
instruction fetch. If RESET is asserted while HOLD is
asserted, RESET has priority and places the bus into an
idle state, rather than the hold acknowledge
(high-impedance) state.


HOLD is a level-sensitive, active HIGH, synchronous 
input. HOLD signals must always meet setup and 
hold times t23 and t24 for correct operation. 


Bus Hold Acknowledge (HLDA)


When active (HIGH), this output indicates the 80376 
has relinquished control of its local bus in response 
to an asserted HOLD signal, and is in the bus Hold 
Acknowledge state. 


The Bus Hold Acknowledge state offers near-complete
signal isolation. In the Hold Acknowledge state, HLDA is
the only signal being driven by the 80376. The other
output signals or bidirectional signals (D15-D0, BHE#,
BLE#, A23-A1, W/R#, D/C#, M/IO#, LOCK# and ADS#) are in a
high-impedance state so the requesting bus master may
control them. These pins remain OFF throughout the time
that HLDA remains active (see Table 4.3). Pull-up
resistors may be desired on several signals to avoid
spurious activity when no bus master is driving them. See
Resistor Recommendations (page 59) for additional
information.


When the HOLD signal is made inactive, the 80376 
will deactivate HLDA and drive the bus. One rising 
edge on the NMI input is remembered for processing 
after the HOLD input is negated. 


Table 4.3. Output Pin State during HOLD

Pin Value   Pin Names

1           HLDA
Float       LOCK#, M/IO#, D/C#, W/R#, ADS#, A23-A1, BHE#, BLE#, D15-D0


In addition to the normal usage of Hold Acknowl- 
edge with DMA controllers or master peripherals, 
the near-complete isolation has particular attractive- 
ness during system test when test equipment drives 
the system, and in hardware-fault-tolerant applica- 
tions. 


Hold Latencies 


The maximum possible HOLD latency depends on 
the software being executed. The actual HOLD la- 
tency at any time depends on the current bus activi- 
ty, the state of the LOCK# signal (internal to the 
CPU) activated by the LOCK# prefix, and interrupts. 
The 80376 will not honor a HOLD request until the 
current bus operation is complete. Table 4.4 shows 
the types of bus operations that can affect HOLD 
latency, and indicates the types of delays that 
these operations may introduce. When considering
maximum HOLD latencies, designers must select
which of these bus operations are possible, and
then select the maximum latency from among them.


The 80376 breaks 32-bit data or I/O accesses into 2 
internally locked 16-bit bus cycles; the LOCK# sig- 
nal is not asserted. The 80376 breaks unaligned 
16-bit or 32-bit data or I/O accesses into 2 or 3 inter- 
nally locked 16-bit bus cycles. Again the LOCK# 
signal is not asserted but a HOLD request will not be 
recognized until the end of the entire transfer. 


As indicated in Table 4.4, wait states affect HOLD 
latency. The 80376 will not honor a HOLD request 
until the end of the current bus operation, no matter 
how many wait states are required. Systems with 
DMA where data transfer is critical must ensure that
READY# returns sufficiently soon.


Table 4.4. Locked Bus Operations Affecting HOLD Latency in Systems Clocks


COPROCESSOR INTERFACE SIGNALS (PEREQ, BUSY#, ERROR#)


In the following sections are descriptions of signals
dedicated to the numeric coprocessor interface. In
addition to the data bus, address bus, and bus cycle
definition signals, these following signals control
communication between the 80376 and the 80387SX processor
extension.


Coprocessor Request (PEREQ)


When asserted (HIGH), this input signal indicates a 
coprocessor request for a data operand to be trans- 
ferred to/from memory by the 80376. In response, 
the 80376 transfers information between the co- 
processor and memory. Because the 80376 has in- 
ternally stored the coprocessor opcode being exe- 
cuted, it performs the requested data transfer with 
the correct direction and memory address. 


PEREQ is a level-sensitive active HIGH asynchronous
signal. Setup and hold times, t29 and t30, relative to
the CLK2 signal must be met to guarantee recognition at a
particular clock edge. This signal is provided with a
weak internal pull-down resistor of around 20 KΩ to
ground so that it will not float active when left
unconnected.


Coprocessor Busy (BUSY#)


When asserted (LOW), this input indicates the co- 
processor is still executing an instruction, and is not 
yet able to accept another. When the 80376 en- 
counters any coprocessor instruction which oper- 
ates on the numerics stack (e.g. load, pop, or arith- 
metic operation), or the WAIT instruction, this input 
is first automatically sampled until it is seen to be 
inactive. This sampling of the BUSY# input prevents 
overrunning the execution of a previous coprocessor 
instruction. 


The F(N)INIT, F(N)CLEX coprocessor instructions 
are allowed to execute even if BUSY# is active, 
since these instructions are used for coprocessor 
initialization and exception-clearing. 


BUSY# is an active LOW, level-sensitive asynchronous
signal. Setup and hold times, t29 and t30, relative to
the CLK2 signal must be met to guarantee recognition at a
particular clock edge. This pin is provided with a weak
internal pull-up resistor of around 20 KΩ to Vcc so that
it will not float active when left unconnected.


BUSY# serves an additional function. If BUSY# is sampled
LOW at the falling edge of RESET, the 80376 processor
performs an internal self-test (see Bus Activity During
and Following Reset on page 54). If BUSY# is sampled
HIGH, no self-test is performed.


Coprocessor Error (ERROR#)


When asserted (LOW), this input signal indicates 
that the previous coprocessor instruction generated 
a coprocessor error of a type not masked by the 
coprocessor's control register. This input is automat- 
ically sampled by the 80376 when a coprocessor 
instruction is encountered, and if active, the 80376 
generates exception 16 to access the error-handling 
software. 


Several coprocessor instructions, generally those which
clear the numeric error flags in the coprocessor or save
coprocessor state, do execute without the 80376
generating exception 16 even if ERROR# is active. These
instructions are FNINIT, FNCLEX, FNSTSW, FNSTSW AX,
FNSTCW, FNSTENV and FNSAVE.
ERROR# is an active LOW, level-sensitive asynchronous
signal. Setup and hold times t29 and t30, relative to the
CLK2 signal must be met to guarantee recognition at a
particular clock edge. This pin is provided with a weak
internal pull-up resistor of around 20 KΩ to Vcc so that
it will not float active when left unconnected.


The following descriptions cover inputs that can
interrupt or suspend execution of the processor's current
instruction stream.


Maskable Interrupt Request (INTR)


When asserted, this input indicates a request for
interrupt service, which can be masked by the 80376 Flag
Register IF bit. When the 80376 responds to the INTR
input, it performs two interrupt acknowledge bus cycles
and, at the end of the second, latches an 8-bit interrupt
vector on D7-D0 to identify the source of the interrupt.


INTR is an active HIGH, level-sensitive asynchronous
signal. Setup and hold times, t27 and t28, relative to
the CLK2 signal must be met to guarantee recognition at a
particular clock edge. To assure recognition of an INTR
request, INTR should remain active until the first
interrupt acknowledge bus cycle begins. INTR is sampled
at the beginning of every instruction. In order to be
recognized at a particular instruction boundary, INTR
must be active at least eight CLK2 clock periods before
the beginning of the execution of the instruction. If
recognized, the 80376 will begin execution of the
interrupt.


Non-Maskable Interrupt Request (NMI)


This input indicates a request for interrupt service
which cannot be masked by software. The non-maskable
interrupt request is always processed according to the
pointer or gate in slot 2 of the interrupt table. Because
of the fixed NMI slot assignment, no interrupt
acknowledge cycles are performed when processing NMI.


NMI is an active HIGH, rising edge-sensitive asynchronous
signal. Setup and hold times, t27 and t28, relative to
the CLK2 signal must be met to guarantee recognition at a
particular clock edge. To assure recognition of NMI, it
must be inactive for at least eight CLK2 periods, and
then be active for at least eight CLK2 periods before the
beginning of the execution of an instruction.


Once NMI processing has begun, no additional NMIs are processed until after the next IRET instruction, which is typically the end of the NMI service routine. If NMI is re-asserted prior to that time, however, one rising edge on NMI will be remembered for processing after executing the next IRET instruction.
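
The NMI behaviour described above can be pictured with a small software model. This is only an illustrative sketch, not a description of the 80376's internal logic; the class `NmiModel` and its method names are hypothetical and exist only for this example.

```python
class NmiModel:
    """Toy model of NMI edge handling as described in the text."""

    def __init__(self):
        self.in_service = False  # an NMI routine is running (no IRET yet)
        self.pending = False     # at most one remembered rising edge
        self.serviced = 0        # number of NMI routines started

    def rising_edge(self):
        """Record a rising edge on the NMI input."""
        if self.in_service:
            self.pending = True  # further edges are not counted separately
        else:
            self._start()

    def iret(self):
        """Execute IRET; a remembered edge is processed afterwards."""
        self.in_service = False
        if self.pending:
            self.pending = False
            self._start()

    def _start(self):
        self.in_service = True
        self.serviced += 1


nmi = NmiModel()
nmi.rising_edge()  # first NMI starts its service routine
nmi.rising_edge()  # remembered for later
nmi.rising_edge()  # not remembered separately: only one edge is kept
nmi.iret()         # the remembered NMI is processed now
nmi.iret()
print(nmi.serviced)
```

Three edges arrive but only two service routines run, which is the practical consequence of remembering a single edge.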


Interrupt Latency


The time that elapses before an interrupt request is serviced (interrupt latency) varies according to several factors. This delay must be taken into account by the interrupt source. Any of the following factors can affect interrupt latency:


1. If interrupts are masked, an INTR request will not be recognized until interrupts are reenabled.

2. If an NMI is currently being serviced, an incoming NMI request will not be recognized until the 80376 encounters the IRET instruction.

3. An interrupt request is recognized only on an instruction boundary of the 80376 Execution Unit except for the following cases:

   - Repeat string instructions can be interrupted after each iteration.

   - If the instruction loads the Stack Segment register, an interrupt is not processed until after the following instruction, which should be an ESP load. This allows the entire stack pointer to be loaded without interruption.

   - If an instruction sets the interrupt flag (enabling interrupts), an interrupt is not processed until after the next instruction.

   The longest latency occurs when the interrupt request arrives while the 80376 processor is executing a long instruction such as multiplication, division or a task-switch.

4. Saving the Flags register and CS:EIP registers.

5. If the interrupt service routine requires a task switch, time must be allowed for the task switch.

6. If the interrupt service routine saves registers that are not automatically saved by the 80376.


This input signal suspends any operation in progress and places the 80376 in a known reset state. The 80376 is reset by asserting RESET for 15 or more CLK2 periods (80 or more CLK2 periods before requesting self-test). When RESET is active, all other input pins are ignored, and all other bus pins are driven to an idle bus state as shown in Table 4.5. If RESET and HOLD are both active at a point in time, RESET takes priority even if the 80376 was in a Hold Acknowledge state prior to RESET active.


RESET is an active HIGH, level-sensitive synchronous signal. Setup and hold times, t25 and t26, must be met in order to assure proper operation of the 80376.
Pin Name        Signal Level during RESET
ADS#            1
D15-D0          Float
BHE#, BLE#      0
A23-A1          1
W/R#            0
D/C#            1
M/IO#           0
LOCK#           1
HLDA            0


All data transfers occur as a result of one or more 
bus cycles. Logical data operands of byte and word 
lengths may be transferred without restrictions on 
physical address alignment. Any byte boundary may 
be used, although two physical bus cycles are per- 
formed as required for unaligned operand transfers. 


The 80376 processor address signals are designed 
to simplify external system hardware. BHE# and 
BLE# provide linear selects for the two bytes of the 
16-bit data bus. 


Byte Enable outputs BHE# and BLE# are asserted 
when their associated data bus bytes are involved 
with the present bus cycle, as listed in Table 4.6. 


Table 4.6. Byte Enables and Associated Data and Operand Bytes

Byte Enable     Associated Data Bus Signals
BHE#            D15-D8 (Byte 1, Most Significant)
BLE#            D7-D0  (Byte 0, Least Significant)
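
As a rough illustration of Table 4.6, the following sketch returns the byte enable levels for a single bus cycle. The function name `byte_enables` and its argument convention are assumptions made for this example; only the mapping of BHE# to D15-D8 and BLE# to D7-D0 comes from the table.

```python
def byte_enables(byte_address, nbytes):
    """Return (BHE#, BLE#) pin levels for one bus cycle.

    0 means asserted (the signals are active LOW), 1 means negated.
    The transfer must fit in one 16-bit word: a single byte at any
    address, or a word at an even address.
    """
    odd = byte_address & 1
    if nbytes == 1:
        # An odd byte travels on D15-D8 (BHE#), an even byte on D7-D0 (BLE#).
        return (0, 1) if odd else (1, 0)
    if nbytes == 2 and not odd:
        return (0, 0)  # both bytes of the data bus are involved
    raise ValueError("transfer does not fit in one 16-bit bus cycle")


for addr, size in [(0x1000, 1), (0x1001, 1), (0x1000, 2)]:
    bhe, ble = byte_enables(addr, size)
    print(f"address {addr:06X}H, {size} byte(s): BHE#={bhe} BLE#={ble}")
```

Transfers that do not fit in one word are rejected here because they are split into several bus cycles, each with its own byte enables.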


Each bus cycle is composed of at least two bus 
states. Each bus state requires one processor clock 
period. Additional bus states added to a single bus 
cycle are called wait states. See Bus Functional 
Description 
(page 39) for additional information. 


Bus cycles may access physical memory space or I/O space. Peripheral devices in the system may either be memory-mapped, or I/O-mapped, or both. As shown in Figure 4.3, physical memory addresses range from 000000H to 0FFFFFFH (16 Mbytes) and I/O addresses from 000000H to 00FFFFH (64 Kbytes). Note the I/O addresses used by the automatic I/O cycles for coprocessor communication are 8000F8H to 8000FFH, beyond the address range of programmed I/O, to allow easy generation of a coprocessor chip select signal using the A23 and M/IO# signals.
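
The coprocessor select described above amounts to a two-input decode. The sketch below is a software model under that assumption; the function names are hypothetical, and a real design would implement the decode in external logic rather than code.

```python
def a23_level(address):
    """Level of address bit A23 for a 24-bit address (1 = HIGH)."""
    return (address >> 23) & 1


def coprocessor_select(a23, m_io_n):
    """True for an automatic coprocessor I/O cycle.

    a23    : level of A23 (1 = HIGH)
    m_io_n : level of M/IO# (0 = I/O cycle, 1 = memory cycle)
    """
    return a23 == 1 and m_io_n == 0


# Coprocessor I/O addresses have A23 HIGH; programmed I/O never does.
print(coprocessor_select(a23_level(0x8000F8), m_io_n=0))  # coprocessor cycle
print(coprocessor_select(a23_level(0x00FFFF), m_io_n=0))  # programmed I/O
print(coprocessor_select(a23_level(0x8000F8), m_io_n=1))  # memory access
```

Because programmed I/O is limited to 000000H-00FFFFH, A23 is never HIGH during a programmed I/O cycle, so no further address bits need to be decoded for the select.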


With the flexibility of memory addressing on the 
80376, it is possible to transfer a logical operand 
that spans more than one physical Dword or word of 
memory or I/O. Examples are 32-bit Dword or 16-bit 
word operands beginning at addresses not evenly 
divisible by 2. 


Operand alignment and size dictate when multiple 
bus cycles are required. Table 4.6a describes the 
transfer cycles generated for all combinations of log- 
ical operand lengths and alignment. 


Table 4.6a. Transfer Bus Cycles for Bytes, Words and Dwords

Byte-Length of      Physical Byte Address      Transfer
Logical Operand     in Memory                  Cycles
                    (Low-Order Bits)
1                   xx                         b
2                   00                         w
2                   01                         lb, hb
2                   10                         w
2                   11                         hb, lb
4                   00                         lw, hw
4                   01                         hb, lb, mw
4                   10                         hw, lw
4                   11                         mw, hb, lb

Key:
b = byte transfer
w = word transfer
l = low-order portion
m = mid-order portion
h = high-order portion
x = don't care
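
The number of bus cycles in Table 4.6a follows from how many 16-bit words the operand touches. The sketch below reproduces only the cycle counts, not the order or naming of the individual transfers; the function name `bus_cycles` is an assumption made for this example.

```python
def bus_cycles(byte_address, nbytes):
    """Number of 16-bit bus cycles needed to transfer an operand.

    Each bus cycle covers one aligned 16-bit word, so the count is the
    number of aligned words the operand touches.
    """
    first_word = byte_address // 2
    last_word = (byte_address + nbytes - 1) // 2
    return last_word - first_word + 1


for nbytes in (1, 2, 4):
    counts = [bus_cycles(addr, nbytes) for addr in range(4)]
    print(f"{nbytes}-byte operand, address bits 00/01/10/11: {counts}")
```

An odd-aligned Dword is the worst case: it touches three words and therefore needs three bus cycles, matching the three-entry columns of the table.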
[Figure 4.3: map of the physical memory space and the I/O space. The I/O space shows the accessible 64-Kbyte programmed I/O space from 000000H to 00FFFFH and the coprocessor addresses 8000F8H to 8000FFH.]
NOTE:
Since A23 is HIGH during automatic communication with the coprocessor, A23 HIGH and M/IO# LOW can be used to easily generate a coprocessor select signal.


4.4 Bus Functional Description


The 80376 has separate, parallel buses for data and address. The data bus is 16 bits in width, and bidirectional. The address bus provides a 24-bit value using 23 signals for the 23 upper-order address bits and 2 Byte Enable signals to directly indicate the active bytes. These buses are interpreted and controlled by several definition signals.


The definition of each bus cycle is given by three signals: M/IO#, W/R# and D/C#. At the same time, a valid address is present on the byte enable signals, BHE# and BLE#, and the other address signals A23-A1. A status signal, ADS#, indicates when the 80376 issues a new bus cycle definition and address.


Collectively, the address bus, data bus and all asso- 
ciated control signals are referred to simply as "the 
bus". When active, the bus performs one of the bus 
cycles below: 
1. Read from memory space 
2. Locked read from memory space 
3. Write to memory space 
4. Locked write to memory space 


5. Read from I/O space (or coprocessor) 
6. Write to I/O space (or coprocessor) 
7. Interrupt acknowledge (always locked) 
8. Indicate halt, or indicate shutdown 


Table 4.2 shows the encoding of the bus cycle definition signals for each bus cycle. See Bus Cycle Definition Signals (page 35) for additional information.


When the 80376 bus is not performing one of the 
activities listed above, it is either Idle or in the Hold 
Acknowledge state, which may be detected by ex- 
ternal circuitry. The idle state can be identified by the 
80376 giving no further assertions on its address 
strobe output (ADS#) 
since the beginning of its 
most recent bus cycle, and the most recent bus cy- 
cle having been terminated. The hold acknowledge 
state is identified by the 80376 asserting its hold ac- 
knowledge (HLDA) output. 


The shortest time unit of bus activity is a bus state. A 
bus state is one processor clock period (two CLK2 
periods) in duration. A complete data transfer occurs 
during a bus cycle, composed of two or more bus 
states. 
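
The timing relationships above can be turned into a quick calculation. The sketch below is illustrative only: the CLK2 frequency of 32 MHz is an assumed example value, not a figure from this section, and the function name `bus_timing` is hypothetical.

```python
def bus_timing(clk2_hz, wait_states=0):
    """Return (bus_state_s, bus_cycle_s, bytes_per_second).

    A bus state lasts two CLK2 periods, a bus cycle has two bus states
    plus any wait states, and the 16-bit data bus moves at most two
    bytes per bus cycle.
    """
    clk2_period = 1.0 / clk2_hz
    bus_state = 2 * clk2_period
    bus_cycle = (2 + wait_states) * bus_state
    bandwidth = 2 / bus_cycle
    return bus_state, bus_cycle, bandwidth


CLK2_HZ = 32e6  # assumed example frequency
for waits in (0, 1, 2):
    state, cycle, bw = bus_timing(CLK2_HZ, waits)
    print(f"{waits} wait state(s): cycle = {cycle * 1e9:.1f} ns, "
          f"peak = {bw / 1e6:.2f} Mbytes/s")
```

With the assumed 32 MHz CLK2, a zero-wait-state cycle takes 125 ns, which bounds the peak transfer rate at 16 Mbytes per second; every wait state adds another 62.5 ns.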
[Timing diagram: three consecutive non-pipelined read cycles (Cycles 1-3), each consisting of T1 and T2. Signals shown: CLK2 (input); BHE#, BLE#, A1-A23, M/IO#, D/C#, W/R# (outputs); ADS# (output); NA# (input); READY# (input); LOCK# (output); D0-D15 (input during read).]

Figure 4.4. Fastest Read Cycles with Non-Pipelined Timing


The fastest 80376 bus cycle requires only two bus states. For example, three consecutive bus read cycles, each consisting of two bus states, are shown by Figure 4.4. The bus states in each cycle are named T1 and T2. Any memory or I/O address may be accessed by such a two-state bus cycle, if the external hardware is fast enough.

Every bus cycle continues until it is acknowledged by the external system hardware, using the 80376 READY# input. Acknowledging the bus cycle at the end of the first T2 results in the shortest bus cycle, requiring only T1 and T2. If READY# is not immediately asserted, however, T2 states are repeated indefinitely until the READY# input is sampled active.

The pipelining option provides a choice of bus cycle timings. Pipelined or non-pipelined cycles are selectable on a cycle-by-cycle basis with the Next Address (NA#) input.


When pipelining is selected the address (BHE#, BLE# and A23-A1) and definition (W/R#, D/C#, M/IO# and LOCK#) of the next cycle are available before the end of the current cycle. To signal their availability, the 80376 address status output (ADS#) is asserted. Figure 4.5 illustrates the fastest read cycles with pipelined timing.


Note from Figure 4.5 the fastest bus cycles using pipelining require only two bus states, named T1P and T2P. Therefore pipelined cycles allow the same data bandwidth as non-pipelined cycles, but address-to-data access time is increased by one T-state time compared to that of a non-pipelined cycle.
[Timing diagram: three consecutive pipelined read cycles (Cycles 1-3), each consisting of T1P and T2P. Signals shown: CLK2 (input); BHE#, BLE#, A1-A23, M/IO#, D/C#, W/R# (outputs); ADS# (output); NA# (input); READY# (input); LOCK# (output); D0-D15 (input during read).]

Figure 4.5. Fastest Read Cycles with Pipelined Timing


READ AND WRITE CYCLES

Data transfers occur as a result of bus cycles, classified as read or write cycles. During read cycles, data is transferred from an external device to the processor. During write cycles, data is transferred from the processor to an external device.

Two choices of bus cycle timing are dynamically selectable: non-pipelined or pipelined. After an idle bus state, the processor always uses non-pipelined timing. However, the NA# (Next Address) input may be asserted to select pipelined timing for the next bus cycle. When pipelining is selected and the 80376 has a bus request pending internally, the address and definition of the next cycle are made available even before the current bus cycle is acknowledged by READY#.

Terminating a read or write cycle, like any bus cycle, requires acknowledging the cycle by asserting the READY# input. Until acknowledged, the processor inserts wait states into the bus cycle, to allow adjustment for the speed of any external device. External hardware, which has decoded the address and bus cycle type, asserts the READY# input at the appropriate time.


At the end of the second bus state within the bus cycle, READY# is sampled. At that time, if external hardware acknowledges the bus cycle by asserting READY#, the bus cycle terminates as shown in Figure 4.6. If READY# is negated as in Figure 4.7, the 80376 executes another bus state (a wait state) and READY# is sampled again at the end of that state. This continues indefinitely until the cycle is acknowledged by READY# asserted.
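
The wait-state mechanism described above can be sketched as a short simulation of one non-pipelined bus cycle. This is a simplified model under the assumption that NA# stays negated; the function name `run_bus_cycle` and the list-of-samples interface are hypothetical.

```python
def run_bus_cycle(ready_samples):
    """Return the bus states of one non-pipelined bus cycle.

    ready_samples holds the READY# level sampled at the end of each T2
    (0 = asserted, 1 = negated). Every sample taken while READY# is
    negated adds one wait state.
    """
    states = ["T1"]
    for level in ready_samples:
        states.append("T2")
        if level == 0:      # READY# asserted: the cycle is acknowledged
            return states
    raise RuntimeError("bus cycle was never acknowledged")


fastest = run_bus_cycle([0])
slow = run_bus_cycle([1, 1, 0])
print(fastest, "wait states:", len(fastest) - 2)
print(slow, "wait states:", len(slow) - 2)
```

The number of wait states is simply the number of bus states beyond the minimum of two.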


When the current cycle is acknowledged, the 80376 terminates it. When a read cycle is acknowledged, the 80376 latches the information present at its data pins. When a write cycle is acknowledged, the write data of the 80376 remains valid throughout phase one of the next bus state, to provide write data hold time.


[Figure 4.6: timing diagram of non-pipelined write and read cycles (Cycles 1-4), each consisting of T1 and T2, with idle states between cycles. Signals shown include the processor clock, BHE#, BLE#, A1-A23, M/IO#, D/C#, ADS# and NA#.]

Idle states are shown here for diagram variety only. Write cycles are not always followed by an idle state. An active bus cycle can immediately follow the write cycle.


Non-Pipelined Bus Cycles


Any bus cycle may be performed with non-pipelined timing. For example, Figure 4.6 shows a mixture of non-pipelined read and write cycles. Figure 4.6 shows that the fastest possible non-pipelined cycles have two bus states per bus cycle. The states are named T1 and T2. In phase one of T1, the address signals and bus cycle definition signals are driven valid and, to signal their availability, address strobe (ADS#) is simultaneously asserted.


During read or write cycles, the data bus behaves as follows. If the cycle is a read, the 80376 floats its data signals to allow driving by the external device being addressed. The 80376 requires that all data bus pins be at a valid logic state (HIGH or LOW) at the end of each read cycle, when READY# is asserted. The system MUST be designed to meet this requirement. If the cycle is a write, data signals are driven by the 80376 beginning in phase two of T1 until phase one of the bus state following cycle acknowledgement.


Idle states are shown here for diagram variety only. Write cycles are not always followed by an idle state. An active bus cycle can immediately follow the write cycle.

Figure 4.7. Various Non-Pipelined Bus Cycles (Various Number of Wait States)


Figure 4.7 illustrates non-pipelined bus cycles with one wait state added to Cycles 2 and 3. READY# is sampled inactive at the end of the first T2 in Cycles 2 and 3. Therefore Cycles 2 and 3 have T2 repeated again. At the end of the second T2, READY# is sampled active.

When address pipelining is not used, the address and bus cycle definition remain valid during all wait states. When wait states are added and it is desirable to maintain non-pipelined timing, it is necessary to negate NA# during each T2 state except the last one, as shown in Figure 4.7, Cycles 2 and 3. If NA# is sampled active during a T2 other than the last one, the next state would be T2i or T2P instead of another T2.

When address pipelining is not used, the bus states and transitions are completely illustrated by Figure 4.8. The bus transitions between four possible states, T1, T2, Ti, and Th. Bus cycles consist of T1 and T2, with T2 being repeated for wait states. Otherwise the bus may be idle, Ti, or in the hold acknowledge state Th.


[State diagram: bus states T1, T2, Ti and Th. Transition labels legible in the source include "HOLD negated, request pending", "READY# asserted, HOLD negated, request pending" and "READY# negated, NA# negated".]

Bus States:
T1: first clock of a non-pipelined bus cycle (80376 drives new address and asserts ADS#).
T2: subsequent clocks of a bus cycle when NA# has not been sampled asserted in the current bus cycle.
Ti: idle state.
Th: hold acknowledge state (80376 asserts HLDA).

The fastest bus cycle consists of two states: T1 and T2.
Four basic bus states describe bus operation when not using pipelined address.

Figure 4.8. 80376 Bus States (Not Using Pipelined Address)
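
The four-state behaviour of Figure 4.8 can be sketched as a transition function. This is a simplified model that assumes NA# is never asserted; the function name `next_state` and its boolean arguments are hypothetical, and the priority given to HOLD over a pending request reflects the rule that Th is entered when HOLD is asserted.

```python
def next_state(state, ready=False, hold=False, request=False):
    """Next bus state when address pipelining is not used.

    ready   : READY# sampled asserted at the end of this state
    hold    : HOLD input asserted
    request : a bus request is pending internally
    """
    if state == "T1":
        return "T2"              # T1 always leads to T2
    if state == "T2" and not ready:
        return "T2"              # unacknowledged: repeat T2 (wait state)
    if state in ("T2", "Ti", "Th"):
        if hold:
            return "Th"          # hold acknowledge
        return "T1" if request else "Ti"
    raise ValueError(f"unknown state {state!r}")


state = "Ti"
trace = [state]
steps = [
    dict(request=True),              # Ti: a request arrives
    dict(),                          # T1
    dict(ready=False),               # T2: wait state
    dict(ready=True),                # T2: acknowledged, nothing pending
    dict(hold=True),                 # Ti: HOLD asserted
    dict(hold=False, request=True),  # Th: HOLD negated, request pending
]
for inputs in steps:
    state = next_state(state, **inputs)
    trace.append(state)
print(" -> ".join(trace))
```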


Bus cycles always begin with T1. T1 always leads to T2. If a bus cycle is not acknowledged during T2 and NA# is inactive, T2 is repeated. When a cycle is acknowledged during T2, the following state will be T1 of the next bus cycle if a bus request is pending internally, or Ti if there is no bus request pending, or Th if the HOLD input is being asserted.

Use of pipelining allows the 80376 to enter three additional bus states not shown in Figure 4.8. Figure 4.12 on page 49 is the complete bus state diagram, including pipelined cycles.

Pipelined Bus Cycles

Pipelining is the option of requesting the address and the bus cycle definition of the next internally pending bus cycle before the current bus cycle is acknowledged with READY# asserted. ADS# is asserted by the 80376 when the next address is issued. The pipelining option is controlled on a cycle-by-cycle basis with the NA# input signal.

Once a bus cycle is in progress and the current address has been valid for at least one entire bus state, the NA# input is sampled at the end of every phase one until the bus cycle is acknowledged. During non-pipelined bus cycles NA# is sampled at the end of phase one in every T2. An example is Cycle 2 in Figure 4.9, during which NA# is sampled at the end of phase one of every T2 (it was asserted once during the first T2 and has no further effect during that bus cycle).


Following any idle bus state (Ti), bus cycles are non-pipelined. Within non-pipelined bus cycles, NA# is only sampled during wait states. Therefore, to begin pipelining during a group of non-pipelined bus cycles requires a non-pipelined cycle with at least one wait state (Cycle 2 above).


If NA# is sampled active, the 80376 is free to drive the address and bus cycle definition of the next bus cycle, and assert ADS#, as soon as it has a bus request internally pending. It may drive the next address as early as the next bus state, whether the current bus cycle is acknowledged at that time or not.


Regarding the details of pipelining, the 80376 has the following characteristics:

1. The next address and status may appear as early as the bus state after NA# was sampled active (see Figures 4.9 or 4.10). In that case, state T2P is entered immediately. However, when there is not an internal bus request already pending, the next address and status will not be available immediately after NA# is asserted and T2i is entered instead of T2P (see Figure 4.11, Cycle 3).

   Provided the current bus cycle isn't yet acknowledged by READY# asserted, T2P will be entered as soon as the 80376 does drive the next address and status. External hardware should therefore observe the ADS# output as confirmation the next address and status are actually being driven on the bus.

2. Any address and status which are validated by a pulse on the 80376 ADS# output will remain stable on the address pins for at least two processor clock periods. The 80376 cannot produce a new address and status more frequently than every two processor clock periods (see Figures 4.9, 4.10 and 4.11).

3. Only the address and bus cycle definition of the very next bus cycle is available. The pipelining capability cannot look further than one bus cycle ahead (see Figure 4.11, Cycle 1).


[Timing diagram: Cycle 1 non-pipelined (write), followed by Cycle 2 pipelined (read), Cycle 3 pipelined (write) and Cycle 4 pipelined (read). Signals shown include CLK2, ADS# and LOCK#.]

Following any idle bus state (Ti) the bus cycle is always non-pipelined and NA# is only sampled during wait states. To start address pipelining after an idle state requires a non-pipelined cycle with at least one wait state (Cycle 1 above).

The pipelined cycles (2, 3, 4 above) are shown with various numbers of wait states.

Figure 4.10. Fastest Transition to Pipelined Bus Cycle Following Idle Bus State


The complete bus state transition diagram, including pipelining, is given by Figure 4.12. Note it is a superset of the diagram for non-pipelined only, and the three additional bus states for pipelining are drawn in bold.

The fastest bus cycle with pipelining consists of just two bus states, T1P and T2P (recall for non-pipelined it is T1 and T2). T1P is the first bus state of a pipelined cycle.

Initiating and Maintaining Pipelined Bus Cycles

Using the state diagram Figure 4.12, observe the transitions from an idle state, Ti, to the beginning of a pipelined bus cycle T1P. From an idle state, Ti, the first bus cycle must begin with T1, and is therefore a non-pipelined bus cycle. The next bus cycle will be pipelined, however, provided NA# is asserted and the first bus cycle ends in a T2P state (the address and status for the next bus cycle is driven during T2P). The fastest path from an idle state to a pipelined bus cycle is shown in bold below:

Ti, Ti, Ti,     T1-T2-T2P,            T1P-T2P
idle states     non-pipelined cycle   pipelined cycle
[Timing diagram: pipelined cycles with wait states, showing T2 and T2P states; signals include CLK2, LOCK# and D0-D15. Annotations legible in the source: "Asserting NA# more than once during any cycle has no additional effects." and "... been asserted in T1P if desired. Assertion now is the latest time possible to allow 80376 to enter T2P state to maintain pipelining in Cycle 3."]

Figure 4.11. Details of Address Pipelining during Cycles with Wait States

T1-T2-T2P are the states of the bus cycle that establishes address pipelining for the next bus cycle, which begins with T1P. The same is true after a bus hold state, shown below:

Th, Th, Th,               T1-T2-T2P,            T1P-T2P
hold acknowledge states   non-pipelined cycle   pipelined cycle

The transition to pipelined address is shown functionally by Figure 4.10, Cycle 1. Note that Cycle 1 is used to transition into pipelined address timing for the subsequent Cycles 2, 3 and 4, which are pipelined. The NA# input is asserted at the appropriate time to select address pipelining for Cycles 2, 3 and 4.

Once a bus cycle is in progress and the current address and status have been valid for one entire bus state, the NA# input is sampled at the end of every phase one until the bus cycle is acknowledged.


[Figure 4.12 state diagram: complete bus states including pipelining. Transition labels legible in the source include "READY# asserted, HOLD negated, no request".]

Bus States:
T1: first clock of a non-pipelined bus cycle (80376 drives new address, status and asserts ADS#).
T2: subsequent clocks of a bus cycle when NA# has not been sampled asserted in the current bus cycle.
T2i: subsequent clocks of a bus cycle when NA# has been sampled asserted in the current bus cycle but there is not yet an internal bus request pending (80376 will not drive new address, status or assert ADS#).
T2P: subsequent clocks of a bus cycle when NA# has been sampled asserted in the current bus cycle and there is an internal bus request pending (80376 drives new address, status and asserts ADS#).
T1P: first clock of a pipelined bus cycle.
Ti: idle state.
Th: hold acknowledge state (80376 asserts HLDA).

Asserting NA# for pipelined bus cycles gives access to three more bus states: T2i, T2P and T1P.
Using pipelining the fastest bus cycle consists of T1P and T2P.
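
The state definitions above can be exercised with a small transition function. This is a simplified sketch built only from the state descriptions in the legend: HOLD and the Th state are deliberately left out, and the function name `next_state_pipelined` and its arguments are hypothetical. It is not a substitute for the complete state diagram.

```python
def next_state_pipelined(state, ready, na, request):
    """Next bus state, ignoring HOLD (simplifying assumption).

    ready   : READY# sampled asserted at the end of this state
    na      : NA# sampled asserted during this state
    request : a bus request is pending internally
    """
    if state == "T1":
        return "T2"
    if state == "T1P":
        if not na:
            return "T2"
        return "T2P" if request else "T2i"
    if state == "T2P":
        # The next address is already driven, so the next cycle is pipelined.
        return "T1P" if ready else "T2P"
    if state in ("T2", "T2i"):
        if ready:
            return "T1" if request else "Ti"  # next cycle is not pipelined
        if state == "T2" and not na:
            return "T2"
        return "T2P" if request else "T2i"
    if state == "Ti":
        return "T1" if request else "Ti"
    raise ValueError(f"unknown state {state!r}")


state = "Ti"
trace = []
steps = [
    dict(ready=False, na=False, request=True),  # Ti: request arrives
    dict(ready=False, na=False, request=True),  # T1
    dict(ready=False, na=True, request=True),   # T2: NA# sampled asserted
    dict(ready=True, na=False, request=True),   # T2P: READY# ends the cycle
    dict(ready=False, na=True, request=True),   # T1P: NA# asserted again
    dict(ready=True, na=False, request=True),   # T2P: READY# ends the cycle
]
for inputs in steps:
    state = next_state_pipelined(state, **inputs)
    trace.append(state)
print("-".join(trace))
```

The trace follows the fastest path from idle into pipelining: a T1-T2-T2P transition cycle followed by two-state T1P-T2P cycles.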


Sampling begins in T2 during Cycle 1 in Figure 4.10. Once NA# is sampled active during the current cycle, the 80376 is free to drive a new address and bus cycle definition on the bus as early as the next bus state. In Figure 4.10, Cycle 1 for example, the next address and status is driven during state T2P. Thus Cycle 1 makes the transition to pipelined timing, since it begins with T1 but ends with T2P. Because the address for Cycle 2 is available before Cycle 2 begins, Cycle 2 is called a pipelined bus cycle, and it begins with T1P. Cycle 2 begins as soon as READY# asserted terminates Cycle 1.

Examples of transition bus cycles are Figure 4.10, Cycle 1 and Figure 4.9, Cycle 2. Figure 4.10 shows transition during the very first cycle after an idle bus state, which is the fastest possible transition into address pipelining. Figure 4.9, Cycle 2 shows a transition cycle occurring during a burst of bus cycles. In any case, a transition cycle is the same whenever it occurs: it consists at least of T1, T2 (NA# is asserted at that time), and T2P (provided the 80376 has an internal bus request already pending, which it almost always has). T2P states are repeated if wait states are added to the cycle.

Note that only three states (T1, T2 and T2P) are required in a bus cycle performing a transition from non-pipelined into pipelined timing, for example Figure 4.10, Cycle 1. Figure 4.10, Cycles 2, 3 and 4 show that pipelining can be maintained with two-state bus cycles consisting only of T1P and T2P.

Once a pipelined bus cycle is in progress, pipelined timing is maintained for the next cycle by asserting NA# and detecting that the 80376 enters T2P during the current bus cycle. The current bus cycle must end in state T2P for pipelining to be maintained in the next cycle. T2P is identified by the assertion of ADS#. Figures 4.9 and 4.10, however, each show pipelining ending after Cycle 4 because Cycle 4 ends in T2i. This indicates the 80376 didn't have an internal bus request prior to the acknowledgement of Cycle 4. If a cycle ends with a T2 or T2i, the next cycle will not be pipelined.

Realistically, pipelining is almost always maintained as long as NA# is sampled asserted. This is so because in the absence of any other request, a code prefetch request is always internally pending until the instruction decoder and code prefetch queue are completely full. Therefore pipelining is maintained for long bursts of bus cycles, if the bus is available (i.e., HOLD inactive) and NA# is sampled active in each of the bus cycles.


In response to an interrupt request on the INTR input when interrupts are enabled, the 80376 performs two interrupt acknowledge cycles. These bus cycles are similar to read cycles in that bus definition signals define the type of bus activity taking place, and each cycle continues until acknowledged by READY# sampled active.

The state of A2 distinguishes the first and second interrupt acknowledge cycles. The byte address driven during the first interrupt acknowledge cycle is 4 (A23-A3, A1, BLE# LOW; A2 and BHE# HIGH). The byte address driven during the second interrupt acknowledge cycle is 0 (A23-A1, BLE# LOW and BHE# HIGH).

The LOCK# output is asserted from the beginning of the first interrupt acknowledge cycle until the end of the second interrupt acknowledge cycle. Four idle bus states, Ti, are inserted by the 80376 between the two interrupt acknowledge cycles for compatibility with the interrupt specification TRHRL of the 8259A Interrupt Controller and the 82370 Integrated Peripheral.
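
The byte addresses quoted above follow directly from the pin levels. The sketch below recomputes them; the function names and the convention of passing A23-A1 as a single integer (A1 in bit 0) are assumptions made for this example.

```python
def byte_address(a23_a1, ble_n):
    """Byte address implied by A23-A1 and BLE#.

    a23_a1 : value on the A23-A1 pins as an integer, A1 in bit 0
    ble_n  : BLE# level (0 = asserted, so the even byte is addressed)
    """
    return (a23_a1 << 1) | (0 if ble_n == 0 else 1)


def inta_cycle_number(a2):
    """A2 HIGH marks the first interrupt acknowledge cycle, LOW the second."""
    return 1 if a2 == 1 else 2


first = byte_address(0b10, ble_n=0)   # only A2 HIGH, BLE# LOW
second = byte_address(0b00, ble_n=0)  # A23-A1 all LOW, BLE# LOW
print(first, inta_cycle_number(a2=1))
print(second, inta_cycle_number(a2=0))
```

External logic that needs to tell the two cycles apart only has to look at A2, since every other address signal is the same in both.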


[Timing diagram: two interrupt acknowledge cycles separated by idle states. The data bus is ignored during the first cycle and carries the vector at the end of the second.]

Interrupt Vector (0-255) is read on D0-D7 at end of second Interrupt Acknowledge bus cycle.
Because each Interrupt Acknowledge bus cycle is followed by idle bus states, asserting NA# has no practical effect. Choose the approach which is simplest for your system hardware design.


During both interrupt acknowledge cycles, D15-DO 
float. No data is read at the end of the first interrupt 
acknowledge cycle. At the end of the second inter- 
rupt acknowledge cycle, the 80376 will read an ex- 
ternal interrupt vector from D7-DO of the data bus. 
The vector indicates the specific interrupt number 
(from 0-255) requiring service. 


The 80376 execution unit halts as a result of execut- 
ing a HLT instruction. Signaling its entrance into the 
halt state, a halt indication cycle is performed. The 
halt indication cycle is identified by the state of the 
bus definition signals shown on page 34, Bus Cycle 
Definition 
Signals, 
and a byte address of 2. The 
halt indication cycle must be acknowledged 
by 
READY# asserted. A halted 80376 resumes execu- 
tion when INTR (if interrupts are enabled), NMI or 
RESET is asserted. 
[Timing diagram: halt indication cycle. Annotations legible in the source:]

NOTE: Halt cycle must be acknowledged by READY# asserted. Wait states may be added to the cycle if desired.
80376 remains halted until INTR, NMI or RESET is asserted.
80376 responds to HOLD input while in the halt state.
The 80376 shuts down as a result of a protection fault while attempting to process a double fault. Signaling its entrance into the shutdown state, a shutdown indication cycle is performed. The shutdown indication cycle is identified by the state of the bus definition signals shown on page 34, Bus Cycle Definition Signals, and a byte address of 0. The shutdown indication cycle must be acknowledged by READY# asserted. A shutdown 80376 resumes execution when NMI or RESET is asserted.

ENTERING AND EXITING HOLD ACKNOWLEDGE

The bus hold acknowledge state, Th, is entered in response to the HOLD input being asserted. In the bus hold acknowledge state, the 80376 floats all outputs or bidirectional signals, except for HLDA. HLDA is asserted as long as the 80376 remains in the bus hold acknowledge state. In the bus hold acknowledge state, all inputs except HOLD and RESET are ignored.


Th may be entered from a bus idle state as in Figure 4.16 or after the acknowledgement of the current physical bus cycle if the LOCK# signal is not asserted, as in Figures 4.17 and 4.18.

Th is exited in response to the HOLD input being negated. The following state will be Ti as in Figure 4.16 if no bus request is pending. The following bus state will be T1 if a bus request is internally pending, as in Figures 4.17 and 4.18. Th is also exited in response to RESET being asserted.

If a rising edge occurs on the edge-triggered NMI input while in Th, the event is remembered as a non-maskable interrupt 2 and is serviced when Th is exited unless the 80376 is reset before Th is exited.


NOTE:
For maximum design flexibility the 80376 has no internal pull-up resistors on its outputs. Your design may require an external pullup on ADS# and other 80376 outputs to keep them negated during float periods.


RESET being asserted takes priority over HOLD being asserted. If RESET is asserted while HOLD remains asserted, the 80376 drives its pins to defined states during reset, as in Table 4.5, Pin State During Reset, and performs internal reset activity as usual.

If HOLD remains asserted when RESET is inactive, the 80376 enters the hold acknowledge state before performing its first bus cycle, provided HOLD is still asserted when the 80376 processor would otherwise perform its first bus cycle. If HOLD remains asserted when RESET is inactive, the BUSY# input is still sampled as usual to determine whether a self test is being requested.

BUS ACTIVITY DURING AND FOLLOWING RESET

RESET is the highest priority input signal, capable of interrupting any processor activity when it is asserted. A bus cycle in progress can be aborted at any stage, or idle states or bus hold acknowledge states discontinued so that the reset state is established.
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NOTE: 
HOLD is a synchronous input and can be asserted at any CLK2 edge, provided setup and hold (t23 and t24) require- 
ments are met. This waveform is useful for determining Hold Acknowledge latency. 


Figure 4.17. Requesting 
Hold from Active 
Bus (NA,* 
Inactive) 


RESET should remain asserted for at least 15 CLK2 periods to ensure it is
recognized throughout the 80376, and at least 80 CLK2 periods if an 80376
self-test is going to be requested at the falling edge. RESET asserted pulses less
than 15 CLK2 periods may not be recognized. RESET pulses less than 80 CLK2 periods
followed by a self-test may cause the self-test to report a failure when no true
failure exists.
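The two RESET pulse-width limits can be captured in a small checker. The following is a minimal sketch in Python, not part of the original data sheet; the function name and the returned strings are hypothetical, and only the 15 and 80 CLK2 thresholds come from the text.

```python
# Thresholds quoted in the text, in CLK2 periods.
MIN_RESET_CLK2 = 15            # shorter pulses may not be recognized
MIN_RESET_SELF_TEST_CLK2 = 80  # needed when self-test is requested at the falling edge


def classify_reset_pulse(clk2_periods, self_test_requested):
    """Return a short verdict for a RESET pulse of the given length (hypothetical helper)."""
    if clk2_periods < MIN_RESET_CLK2:
        return "may not be recognized"
    if self_test_requested and clk2_periods < MIN_RESET_SELF_TEST_CLK2:
        return "self-test may report a false failure"
    return "ok"


if __name__ == "__main__":
    for length, self_test in [(10, False), (40, False), (40, True), (80, True)]:
        print(length, self_test, classify_reset_pulse(length, self_test))
```

A board-level reset generator could be checked against this rule by converting its pulse width to CLK2 periods first.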


Provided the RESET falling edge meets setup and hold times t25 and t26, the
internal processor clock phase is defined at that time as illustrated by Figure
4.19 and Figure 6.7.

NOTE:
HOLD is a synchronous input and can be asserted at any CLK2 edge, provided setup
and hold (t23 and t24) requirements are met. This waveform is useful for
determining Hold Acknowledge latency.

Figure 4.18. Requesting Hold from Idle Bus (NA# Active)


An 80376 self-test may be requested at the time RESET goes inactive by having the
BUSY# input at a LOW level as shown in Figure 4.19. The self-test requires
(2^20 + approximately 60) CLK2 periods to complete. The self-test duration is not
affected by the test results. Even if the self-test indicates a problem, the 80376
attempts to proceed with the reset sequence afterwards.


After the RESET falling edge (and after the self-test 
if it was requested) the 80376 performs an internal 
initialization sequence for approximately 350 to 450 
CLK2 periods. 
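To give a feel for these durations, the sketch below converts the CLK2 period counts into milliseconds. It is an illustration only: the function name is hypothetical, the self-test length assumes the printed "220" is the exponent form 2^20, and the 32 MHz CLK2 value in the usage lines assumes a 16 MHz part with CLK2 at twice the operating frequency.

```python
SELF_TEST_CLK2 = 2**20 + 60   # assumption: "220" in the text read as 2^20, plus about 60
INIT_CLK2_RANGE = (350, 450)  # internal initialization, approximate range from the text


def reset_to_ready_ms(clk2_hz, self_test):
    """Return (shortest, longest) time in ms from the RESET falling edge
    to the end of internal initialization (hypothetical helper)."""
    base = SELF_TEST_CLK2 if self_test else 0
    shortest = (base + INIT_CLK2_RANGE[0]) / clk2_hz * 1e3
    longest = (base + INIT_CLK2_RANGE[1]) / clk2_hz * 1e3
    return shortest, longest


if __name__ == "__main__":
    print(reset_to_ready_ms(32e6, self_test=False))
    print(reset_to_ready_ms(32e6, self_test=True))
```

Under these assumptions the self-test dominates: it adds roughly 33 ms, while initialization alone takes on the order of 11 to 14 microseconds.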


[Timing diagram: bus activity during reset. Signals shown: RESET, BUSY#, ERROR#,
BHE#, BLE#, W/R#, M/IO#, HLDA, A1-A23, D/C#, LOCK#, ADS#. Annotations: RESET
duration of at least 15 CLK2 periods if not going to request self-test; BUSY# LOW
to begin self-test, otherwise no self-test; ADS# HIGH during reset.]

NOTES: 
1. BUSY# should be held stable for 8 CLK2 periods before and after the CLK2 period in which RESET falling edge 
occurs. 
2. If self-test is requested, the 80376 outputs remain in their reset state as shown here. 


4.5 Self-Test Signature


Upon completion of self-test (if self-test was requested by driving BUSY# LOW at
the falling edge of RESET) the EAX register will contain a signature of 00000000H
indicating the 80376 passed its self-test of microcode and major PLA contents with
no problems detected. The passing signature in EAX, 00000000H, applies to all
80376 revision levels. Any non-zero signature indicates the 80376 unit is faulty.


4.6 Component and Revision Identifiers


To assist 80376 users, the 80376 after reset holds a component identifier and
revision identifier in its DX register. The upper 8 bits of DX hold 33H as
identification of the 80376 component. (The lower nibble, 03H, refers to the
Intel386™ architecture. The upper nibble, 30H, refers to the third member of the
Intel386 family.) The lower 8 bits of DX hold an 8-bit unsigned binary number
related to the component revision level. The revision identifier will, in general,
chronologically track those component steppings which are intended to have certain
improvements or distinction from previous steppings. The 80376 revision identifier
will track that of the 80386 where possible.


The revision identifier is intended to assist 80376 users to a practical extent.
However, the revision identifier value is not guaranteed to change with every
stepping revision, or to follow a completely uniform numerical sequence, depending
on the type or intention of revision, or manufacturing materials required to be
changed. Intel has sole discretion over these characteristics of the component.


Table 4.7. Component and Revision Identifier History

80376 Stepping Name    Revision Identifier
A0                     05H
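Start-up code that saves EAX and DX right after reset can interpret them as described in Sections 4.5 and 4.6. The sketch below is illustrative only; the function names and the returned dictionary layout are hypothetical. The 33H and 05H values come from the text, and the 80386 component value of 3 is taken from the register comparison in Section 7.0.

```python
# Component identifiers held in the upper byte of DX after reset.
COMPONENT_NAMES = {0x33: "80376", 0x03: "80386"}


def self_test_passed(eax):
    """A signature of 00000000H in EAX means the self-test found no problems."""
    return (eax & 0xFFFFFFFF) == 0


def decode_dx(dx):
    """Split the reset value of DX into component and revision identifiers."""
    component_id = (dx >> 8) & 0xFF   # upper 8 bits: component identifier
    revision_id = dx & 0xFF           # lower 8 bits: revision identifier
    return {
        "component_id": component_id,
        "revision_id": revision_id,
        "component": COMPONENT_NAMES.get(component_id, "unknown"),
    }


if __name__ == "__main__":
    print(decode_dx(0x3305))          # an A0-stepping 80376 per Table 4.7
    print(self_test_passed(0x00000000))
```

Because the revision identifier is not guaranteed to change with every stepping, software should treat it as a hint rather than as an exact stepping number.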


4.7 Coprocessor Interfacing


The 80376 provides an automatic interface for the Intel 80387SX numeric
floating-point coprocessor. The 80387SX coprocessor uses an I/O mapped interface
driven automatically by the 80376 and assisted by three dedicated signals: BUSY#,
ERROR# and PEREQ.


As the 80376 begins supporting a coprocessor instruction, it tests the BUSY# and
ERROR# signals to determine if the coprocessor can accept its next instruction.
Thus, the BUSY# and ERROR# inputs eliminate the need for any "preamble" bus cycles
for communication between processor and coprocessor. The 80387SX can be given its
command opcode immediately. The dedicated signals provide instruction
synchronization, and eliminate the need of using the 80376 WAIT opcode (9BH) for
80387SX instruction synchronization (the WAIT opcode was required when the 8086 or
8088 was used with the 8087 coprocessor).


Custom coprocessors can be included in 80376 based systems by memory-mapped or
I/O-mapped interfaces. Such coprocessor interfaces allow a completely custom
protocol, and are not limited to a set of coprocessor protocol "primitives".
Instead, memory-mapped or I/O-mapped interfaces may use all applicable 80376
instructions for high-speed coprocessor communication. The BUSY# and ERROR# inputs
of the 80376 may also be used for the custom coprocessor interface, if such
hardware assist is desired. These signals can be tested by the 80376 WAIT opcode
(9BH). The WAIT instruction will wait until the BUSY# input is inactive
(interruptable by an NMI or enabled INTR input), but generates an exception 16
fault if the ERROR# pin is active when the BUSY# goes (or is) inactive. If the
custom coprocessor interface is memory-mapped, protection of the addresses used
for the interface can be provided with the segmentation mechanism of the 80376. If
the custom interface is I/O-mapped, protection of the interface can be provided
with the 80376 IOPL (I/O Privilege Level) mechanism.


The 80387SX numeric coprocessor interface is I/O mapped as shown in Table 4.8.
Note that the 80387SX coprocessor interface addresses are beyond the 0H-0FFFFH
range for programmed I/O. When the 80376 supports the 80387SX coprocessor, the
80376 automatically generates bus cycles to the coprocessor interface addresses.


Table 4.8. Numeric Coprocessor Port Addresses

Address in 80376 I/O Space    80387SX Coprocessor Register
8000F8H                       Opcode Register
8000FCH                       Operand Register
8000FEH                       Operand Register
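External decode logic or a bus monitor can separate coprocessor cycles from programmed I/O by address alone. The following sketch is illustrative; the function names are hypothetical, and the use of address bit A23 as the distinguishing bit follows the statement in Section 7.0 that the 80376 selects its numerics coprocessor with A23 and M/IO#.

```python
# Port addresses from Table 4.8.
COPROCESSOR_PORTS = {
    0x8000F8: "Opcode Register",
    0x8000FC: "Operand Register",
    0x8000FE: "Operand Register",
}
PROGRAMMED_IO_MAX = 0xFFFF  # programmed I/O covers 0H-0FFFFH


def a23_is_set(io_address):
    """True when address bit A23 is 1."""
    return bool((io_address >> 23) & 1)


def classify_io_address(io_address):
    """Label an I/O-space address (hypothetical helper for a bus monitor)."""
    if io_address in COPROCESSOR_PORTS:
        return "80387SX " + COPROCESSOR_PORTS[io_address]
    if 0 <= io_address <= PROGRAMMED_IO_MAX:
        return "programmed I/O"
    return "outside programmed I/O range"


if __name__ == "__main__":
    for addr in (0x03F8, 0x8000F8, 0x8000FC):
        print(hex(addr), a23_is_set(addr), classify_io_address(addr))
```

All three coprocessor ports have A23 set, while no programmed I/O address can, which is why a single address line plus M/IO# is enough for the select.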


SOFTWARE TESTING FOR COPROCESSOR PRESENCE


When software is used to test coprocessor (80387SX) presence, it should use only
the following coprocessor opcodes: FNINIT, FNSTCW and FNSTSW. To use other
coprocessor opcodes when a coprocessor is known to be not present, first set
EM = 1 in the 80376 CR0 register.


5.0 PACKAGE THERMAL SPECIFICATIONS


The Intel 80376 embedded processor is specified for operation when case
temperature is within the range of 0°C-115°C for the ceramic 88-pin PGA package,
and 0°C-110°C for the 100-pin plastic package. The case temperature may be
measured in any environment, to determine whether the 80376 is within specified
operating range. The case temperature should be measured at the center of the top
surface.

The ambient temperature is guaranteed as long as TC is not violated. The ambient
temperature can be calculated from θJC and θJA using the following equations:

$T_J = T_C + P \cdot \theta_{JC}$

$T_A = T_J - P \cdot \theta_{JA}$

$T_C = T_A + P \cdot [\theta_{JA} - \theta_{JC}]$


Values for θJA and θJC are given in Table 5.1 for the 100-lead fine pitch. θJA is
given at various airflows. Table 5.2 shows the maximum TA allowable (without
exceeding TC) at various airflows. Note that TA can be improved further by
attaching "fins" or a "heat sink" to the package. P is calculated using the
maximum hot Icc.


Table 5.1. 80376 Package Thermal Characteristics
Thermal Resistances (°C/Watt) θJC and θJA

                              θJA Versus Airflow - ft/min (m/sec)
Package              θJC    0      200     400     600     800     1000
                            (0)    (1.01)  (2.03)  (3.04)  (4.06)  (5.07)
100-Lead Fine Pitch  7      33     27      24      21      18      17
88-Pin PGA           2      25     20      17      14      12      11


Assuming Icc hot of 360 mA, Vcc of 5.0V, and a TCASE of 110°C for plastic and
115°C for the 88-Pin PGA Package:


Table 5.2. 80376 Maximum Allowable Ambient Temperature at Various Airflows

                              TA (°C) vs Airflow - ft/min (m/sec)
Package              θJC    0      200     400     600     800     1000
                            (0)    (1.01)  (2.03)  (3.04)  (4.06)  (5.07)
100-Lead Fine Pitch  7      63     74      79      85      91      92
88-Pin PGA           2      74     83      88      93      97      99
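The entries in Table 5.2 follow from the case-temperature equation rearranged for TA, that is TA = TC - P × (θJA - θJC), with P taken as Icc hot times Vcc. The sketch below reproduces the zero-airflow column under the assumptions stated above the table (360 mA, 5.0V); the function names are hypothetical.

```python
def power_watts(icc_amps, vcc_volts):
    """Power dissipation P, using the maximum hot Icc as the text specifies."""
    return icc_amps * vcc_volts


def max_ambient_c(tc_max_c, p_watts, theta_ja, theta_jc):
    """Maximum ambient temperature that keeps the case at or below tc_max_c.

    Rearranged from TC = TA + P * (theta_JA - theta_JC).
    """
    return tc_max_c - p_watts * (theta_ja - theta_jc)


if __name__ == "__main__":
    p = power_watts(0.360, 5.0)                      # about 1.8 W
    # Zero-airflow thermal resistances from Table 5.1.
    print(round(max_ambient_c(110, p, 33, 7), 1))    # 100-lead fine pitch
    print(round(max_ambient_c(115, p, 25, 2), 1))    # 88-pin PGA
```

For a different supply current or heat sink, substitute the measured Icc and the effective θJA of the assembly.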


The following sections describe recommended electrical connections for the 80376,
and its electrical specifications.


6.1 Power and Grounding 


The 80376 is implemented in CHMOS III technology and has modest power
requirements. However, its high clock frequency and 47 output buffers (address,
data, control, and HLDA) can cause power surges as multiple output buffers drive
new signal levels simultaneously. For clean on-chip power distribution at high
frequency, 14 Vcc and 18 Vss pins separately feed functional units of the 80376.


Power and ground connections must be made to all external Vcc and GND pins of the
80376. On the circuit board, all Vcc pins should be connected on a Vcc plane and
all Vss pins should be connected on a GND plane.


Liberal decoupling capacitors should be placed near 
the 80376. The 80376 driving its 24-bit address bus 
and 16-bit data bus at high frequencies can cause 
transient power surges, particularly when driving 
large capacitive loads. Low inductance capacitors 
and interconnects are recommended for best high 
frequency electrical performance. Inductance can 
be reduced by shortening circuit board traces be- 
tween the 80376 and decoupling capacitors as 
much as possible. 


The ERROR# and BUSY# inputs have internal pull-up resistors of approximately
20 KΩ and the PEREQ input has an internal pull-down resistor of approximately
20 KΩ built into the 80376 to keep these signals inactive when the 80387SX is not
present in the system (or temporarily removed from its socket).


In typical designs, the external pull-up resistors shown in Table 6.1 are
recommended. However, a particular design may have reason to adjust the resistor
values recommended here, or alter the use of pull-up resistors in other ways.


Table 6.1. Recommended Resistor Pull-Ups to Vcc

Pin   Signal   Pull-Up Value   Purpose
16    ADS#     20 KΩ ± 10%     Lightly Pull ADS# Inactive during 80376 Hold Acknowledge States
26    LOCK#    20 KΩ ± 10%     Lightly Pull LOCK# Inactive during 80376 Hold Acknowledge States

For reliable operation, always connect unused inputs to an appropriate signal
level. N/C pins should always remain unconnected. Connection of N/C pins to Vcc
or Vss will result in incompatibility with future steppings of the 80376.


Particularly when not using interrupts or bus hold (as when first prototyping),
prevent any chance of spurious activity by connecting these associated inputs to
GND:

- INTR
- NMI
- HOLD


If not using address pipelining connect the NA# pin to a pull-up resistor in the
range of 20 KΩ to Vcc.


6.2 Absolute Maximum Ratings


Table 6.2. Maximum Ratings

Parameter                               Maximum Rating
Storage Temperature                     -65°C to +150°C
Case Temperature under Bias             -65°C to +120°C
Supply Voltage with Respect to Vss      -0.5V to +6.5V
Voltage on Other Pins                   -0.5V to (Vcc + 0.5)V

Table 6.2 gives stress ratings only, and functional operation at the maximums is
not guaranteed. Functional operating conditions are given in Section 6.3, D.C.
Specifications, and Section 6.4, A.C. Specifications.


Extended exposure to the Maximum Ratings may affect device reliability.
Furthermore, although the 80376 contains protective circuitry to resist damage
from static electric discharge, always take precautions to avoid high static
voltages or electric fields.

ADVANCE INFORMATION SUBJECT TO CHANGE

Table 6.3. 80376 D.C. Characteristics
Functional Operating Range: VCC = 5V ± 10%; TCASE = 0°C to 115°C for 88-pin PGA,
TCASE = 0°C to 110°C for 100-pin plastic

Symbol  Parameter                                        Min         Max    Unit, Test Conditions (Notes)
VIL     Input LOW Voltage                                -0.3
VIH     Input HIGH Voltage                               2.0
VILC    CLK2 Input LOW Voltage                           -0.3
VIHC    CLK2 Input HIGH Voltage                          VCC - 0.8
VOL     Output LOW Voltage
          IOL = 4 mA: A23-A1, D15-D0
          IOL = 5 mA: BHE#, BLE#, W/R#, D/C#, M/IO#,
                      LOCK#, ADS#, HLDA
VOH     Output HIGH Voltage
          IOH = -1 mA: A23-A1, D15-D0
          IOH = -0.2 mA:
        Input Leakage Current                                        ±15    µA, 0V ≤ VIN ≤ VCC (1)
        Input Leakage Current (PEREQ Pin)                            200    µA, VIH = 2.4V (1, 2)
        Input Leakage Current (BUSY# and ERROR# Pins)                -400   µA, VIL = 0.45V (3)
        Output Leakage Current                                       ±15    µA, 0.45V ≤ VOUT ≤ VCC (1)
        Supply Current                                               400    mA (4)
        Supply Current at HOT                                        360    mA (6)
CIN     Input Capacitance                                            10     pF, Fc = 1 MHz (5)
COUT    Output or I/O Capacitance                                    12     pF, Fc = 1 MHz (5)
CCLK    CLK2 Capacitance                                             20     pF, Fc = 1 MHz (5)

NOTES:
1. Tested at the minimum operating frequency of the part.
2. PEREQ input has an internal pull-down resistor.
3. BUSY# and ERROR# inputs each have an internal pull-up resistor.
4. Icc max measurement at worst case frequency, Vcc and temperature (0°C).
5. Not 100% tested.
6. Icc HOT max measurement at worst case frequency, Vcc and max temperature.

The A.C. specifications given in Table 6.4 consist of output delays, input setup
requirements and input hold requirements. All A.C. specifications are relative to
the CLK2 rising edge crossing the 2.0V level.


A.C. specification measurement is defined by Figure 6.1. Inputs must be driven to
the voltage levels indicated by Figure 6.1 when A.C. specifications are measured.
80376 output delays are specified with minimum and maximum limits measured as
shown. The minimum 80376 delay times are hold times provided to external
circuitry. 80376 input setup and hold times are specified as minimums, defining
the smallest acceptable sampling window. Within the sampling window, a synchronous
input signal must be stable for correct 80376 processor operation.


Outputs ADS#, W/R#, D/C#, M/IO#, LOCK#, BHE#, BLE#, A23-A1 and HLDA only change at
the beginning of phase one. D15-D0 (write cycles) only change at the beginning of
phase two. The READY#, HOLD, BUSY#, ERROR#, PEREQ and D15-D0 (read cycles) inputs
are sampled at the beginning of phase one. The NA#, INTR and NMI inputs are
sampled at the beginning of phase two.


[Figure: drive levels and measurement points for A.C. specifications. Waveforms
shown: CLK2; outputs (A1-A23, BHE#, BLE#, ADS#, M/IO#, D/C#, W/R#, LOCK#, HLDA);
outputs (D0-D15); inputs (NA#, INTR, NMI); inputs (READY#, HOLD, ERROR#, BUSY#,
PEREQ, D0-D15).]

LEGEND:
A - Maximum Output Delay Spec.
B - Minimum Output Delay Spec.
C - Minimum Input Setup Spec.
D - Minimum Input Hold Spec.


Table 6.4. 80376 A.C. Characteristics at 16 MHz
Functional Operating Range: Vcc = 5V ± 10%; TCASE = 0°C to 115°C for 88-pin PGA,
0°C to 110°C for 100-pin plastic

Symbol  Parameter                             Min   Max   Unit   Figure   Notes
        Operating Frequency                   4     16    MHz             Half CLK2 Freq
t1      CLK2 Period                           31    125   ns     6.3
t2a     CLK2 HIGH Time                        9           ns     6.3      at 2V (3)
t2b     CLK2 HIGH Time                        5           ns              at (Vcc - 0.8)V (3)
t3a     CLK2 LOW Time                         9           ns              (3)
t3b     CLK2 LOW Time                         7                  6.3      at 0.8V (3)
t4      CLK2 Fall Time                                                    (Vcc - 0.8)V to 0.8V (3)
t5      CLK2 Rise Time                                                    0.8V to (Vcc - 0.8)V (3)
t6      A23-A1 Valid Delay                    4                           CL = 120 pF (4)
t7      A23-A1 Float Delay                          40                    (1)
t8      BHE#, BLE#, LOCK# Valid Delay                                     CL = 75 pF (4)
t9      BHE#, BLE#, LOCK# Float Delay                            6.6      (1)
t10     W/R#, M/IO#, D/C#, ADS# Valid Delay               ns     6.5      CL = 75 pF (4)
t11     W/R#, M/IO#, D/C#, ADS# Float Delay         35    ns     6.6      (1)
t12     D15-D0 Valid Delay                    4     40    ns     6.5      CL = 120 pF (4)
t13     D15-D0 Float Delay                    4     35    ns     6.6      (1)
t14     HLDA Valid Delay                      6     33    ns     6.6      CL = 75 pF (4)
t15     NA# Setup Time                        5           ns     6.4
t16     NA# Hold Time                         21          ns     6.6
t19     READY# Setup Time                     19          ns     6.4
t20     READY# Hold Time                      4           ns     6.4
t21     Setup Time D15-D0 Read Data           9           ns     6.4
t22     Hold Time D15-D0 Read Data            6           ns     6.4
t23     HOLD Setup Time                       26          ns     6.4
t24     HOLD Hold Time                        5           ns     6.4
t25     RESET Setup Time                      13          ns     6.7
t26     RESET Hold Time                       4           ns     6.7
t27     NMI, INTR Setup Time                                     6.4
t28     NMI, INTR Hold Time                                      6.4
t29     PEREQ, ERROR#, BUSY# Setup Time                          6.4
        PEREQ, ERROR#, BUSY# Hold Time

NOTE:
The 80376 does not have t17 or t18 timing specifications.

NOTES:
1. Float condition occurs when maximum output current becomes less than ILO in
   magnitude. Float delay is not 100% tested.
2. These inputs are allowed to be asynchronous to CLK2. The setup and hold
   specifications are given for testing purposes, to assure recognition within a
   specific CLK2 period.
3. These are not tested. They are guaranteed by design characterization.
4. Tested with CL set to 50 pF and derated to support the indicated distributed
   capacitive load. See Figure 6.8 for the capacitive derating curve.
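The operating-frequency row and the CLK2 period row of Table 6.4 are two views of the same limit, because the operating frequency is half the CLK2 frequency. The sketch below makes that relationship explicit; the function names are hypothetical, and the 31 ns and 125 ns limits are the t1 values from the table.

```python
T1_MIN_NS = 31.0    # t1 minimum CLK2 period from Table 6.4
T1_MAX_NS = 125.0   # t1 maximum CLK2 period from Table 6.4


def clk2_period_ns(operating_mhz):
    """CLK2 period for a given operating frequency (CLK2 runs at twice that rate)."""
    return 1e3 / (2.0 * operating_mhz)


def within_t1_limits(period_ns):
    """True when a CLK2 period satisfies the t1 limits."""
    return T1_MIN_NS <= period_ns <= T1_MAX_NS


if __name__ == "__main__":
    for mhz in (4, 8, 16):
        period = clk2_period_ns(mhz)
        print(mhz, "MHz ->", period, "ns", within_t1_limits(period))
```

The same conversion is useful when comparing setup and hold times against a clock period, for example the 26 ns HOLD setup time against a 31.25 ns CLK2 period at 16 MHz.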


[Figures: A.C. test load (80376 output, CL = 50 pF) and timing waveforms for CLK2,
input setup and hold, output valid delay, and output float delay. Signals shown:
BUSY#, ERROR#, PEREQ; BHE#, BLE#, LOCK#; W/R#, M/IO#, D/C#, ADS#; A1-A23; D0-D15.
Note on the float delay waveform: also applies to data float when write cycle is
followed by read or idle.]

The second internal processor phase following RESET high-to-low transition
(provided t25 and t26 are met) is φ2.


[Graph: Typical Slew Rates at CMOS Levels versus capacitive load (pF).]

Figure 6.9. CMOS Level Slew Rates for Output Buffers


[Graph: Typical Slew Rates at TTL Levels (0.8V to 2.0V and 2.0V to 0.8V).]

Figure 6.10. TTL Level Slew Rates for Output Buffers


6.5 Designing for ICE™-376 Emulator (Advanced Data)

The 376 embedded processor in-circuit emulator product is the ICE-376 emulator.
Use of the emulator requires the target system to provide a socket that is
compatible with the ICE-376 emulator. The 80376 offers two different probes for
emulating user systems: an 88-pin PGA probe and a 100-pin fine pitch flat-pack
probe. The 100-pin fine pitch flat-pack probe requires a socket, called the
100-pin PQFP, which is available from 3M Textool (part number 2-0100-07243-000).
The ICE-376 emulator probe attaches to the target system via an adapter which
replaces the 80376 component in the target system. Because of the high operating
frequency of 80376 systems and of the ICE-376 emulator, there is no buffering
between the 80376 emulation processor in the ICE-376 emulator probe and the target
system. A direct result of the non-buffered interconnect is that the ICE-376
emulator shares the address and data bus with the user's system, and the RESET
signal is intercepted by the ICE emulator hardware. In order for the ICE-376
emulator to be functional in the user's system without the Optional Isolation
Board (OIB) the designer must be aware of the following conditions:
1. The bus controller must only enable data trans- 
ceivers onto the data bus during valid read cycles 
of the 80376, other local devices or other bus 
masters. 
2. Before another bus master drives the local proc- 
essor address bus, the other master must gain 
control of the address bus by asserting HOLD and 
receiving the HLDA response. 


3. The emulation processor receives the RESET sig- 
nal 2 or 4 CLK2 cycles later than an 80376 would, 
and responds to RESET later. Correct phase of 
the response is guaranteed. 


In addition to the above considerations, the ICE-376 
emulator processor module has several electrical 
and mechanical characteristics that should be taken 
into consideration when designing the 80376 sys- 
tem. 


Capacitive Loading: ICE-376 adds up to 27 pF to each 80376 signal.


Drive Requirements: ICE-376 adds one FAST TTL load on the CLK2, control, address,
and data lines. These loads are within the processor module and are driven by the
80376 emulation processor, which has standard drive and loading capability listed
in Tables 6.3 and 6.4.


Power Requirements: For noise immunity and CMOS latch-up protection the ICE-376
emulator processor module is powered by the user system. The circuitry on the
processor module draws up to 1.4A including the maximum 80376 Icc from the user
80376 socket.


80376 Location and Orientation: The ICE-376 emulator processor module may require
lateral clearance. Figure 6.12 shows the clearance requirements of the iMP adapter
and Figure 6.13 shows the clearance requirements of the 88-pin PGA adapter. The
optional isolation board (OIB), which provides extra electrical buffering and has
the same lateral clearance requirements as Figures 6.12 and 6.13, adds an
additional 0.5 inches to the vertical clearance requirement. This is illustrated
in Figure 6.14.


Optional Isolation Board (OIB) and the CLK2 speed reduction: Due to the unbuffered
probe design, the ICE-376 emulator is susceptible to errors on the user's bus. The
OIB allows the ICE-376 emulator to function in user systems with faults (shorted
signals, etc.). After electrical verification the OIB may be removed. When the OIB
is installed, the user system must have a maximum CLK2 frequency of 20 MHz.


Figure 6.12. Preliminary ICE™-376 Emulator User Cable with PQFP Adapter


7.0 DIFFERENCES BETWEEN THE 80376 AND THE 80386


The following are the major differences between the 
80376 and the 80386. 
1. The 80376 generates byte selects on BHE# and BLE# (like the 8086 and 80286
   microprocessors) to distinguish the upper and lower bytes on its 16-bit data
   bus. The 80386 uses four byte selects, BE0#-BE3#, to distinguish between the
   different bytes on its 32-bit bus.
2. The 80376 has no bus sizing option. The 80386 can select between either a
   32-bit bus or a 16-bit bus by use of the BS16# input. The 80376 has a 16-bit
   bus size.
3. The NA# pin operation in the 80376 is identical to that of the NA# pin on the
   80386 with one exception: the NA# pin of the 80386 cannot be activated on
   16-bit bus cycles (where BS16# is LOW in the 80386 case), whereas NA# can be
   activated on any 80376 bus cycle.
4. The contents of all 80376 registers at reset are identical to the contents of
   the 80386 registers at reset, except the DX register. The DX register contains
   a component-stepping identifier at reset, i.e.

   in 80386, after reset DH = 3 indicates 80386
                         DL = revision number;
   in 80376, after reset DH = 33H indicates 80376
                         DL = revision number.

5. The 80386 uses A31 and M/IO# as a select for numerics coprocessor. The 80376
   uses A23 and M/IO# to select its numerics coprocessor.
6. The 80386 prefetch unit fetches code in four-byte units. The 80376 prefetch
   unit reads two bytes as one unit (like the 80286 microprocessor). In BS16#
   mode, the 80386 takes two consecutive bus cycles to complete a prefetch
   request. If there is a data read or write request after the prefetch starts,
   the 80386 will fetch all four bytes before addressing the new request.
7. The 80376 has no paging mechanism.
8. The 80376 starts executing code in what corresponds to the 80386 protected
   mode. The 80386 starts execution in real mode, which is then used to enter
   protected mode.
9. The 80386 has a virtual-86 mode that allows the execution of a real mode 8086
   program as a task in protected mode. The 80376 has no virtual-86 mode.
10. The 80386 maps a 48-bit logical address into a 32-bit physical address by
    segmentation and paging. The 80376 maps its 48-bit logical address into a
    24-bit physical address by segmentation only.
11. The 80376 uses the 80387SX numerics coprocessor for floating point
    operations, while the 80386 uses the 80387 coprocessor.
12. The 80386 can execute from 16-bit code segments. The 80376 can only execute
    from 32-bit code segments.


This section describes the 376 embedded processor 
instruction set. Table 8.1 lists all instructions along 
with 
instruction 
encoding 
diagrams 
and 
clock 
counts. Further details of the instruction encoding 
are then provided in the following sections, which 
completely describe the encoding structure and the 
definition of all fields occurring within 80376 instruc- 
tions. 


8.1 80376 Instruction Encoding and Clock Count Summary


To calculate elapsed time for an instruction, multiply 
the instruction clock count, as listed in Table 8.1 be- 
low, by the processor clock period (e.g. 62.5 ns for 
an 80376 operating at 16 MHz). The actual clock 
count of an 80376 program will average 10% more 


than the calculated clock count due to instruction 
sequences which execute faster than they can be 
fetched from memory. 
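The elapsed-time rule above is a single multiplication, with the 10% average overhead applied on top when estimating a whole program. A minimal sketch follows; the function name and the `overhead` parameter are hypothetical, while the 62.5 ns period at 16 MHz and the 10% figure come from the text.

```python
def elapsed_ns(clock_count, processor_mhz=16.0, overhead=0.0):
    """Estimate elapsed time in nanoseconds for a given clock count.

    overhead is a fraction, e.g. 0.10 for the 10% average quoted in the text
    for whole programs; use 0.0 for a single instruction's listed count.
    """
    period_ns = 1e3 / processor_mhz      # 62.5 ns at 16 MHz
    return clock_count * period_ns * (1.0 + overhead)


if __name__ == "__main__":
    print(elapsed_ns(2))                    # a 2-clock instruction at 16 MHz
    print(elapsed_ns(1000, overhead=0.10))  # a 1000-clock sequence with average overhead
```

The overhead figure is an average over typical programs, so it is only meaningful for longer instruction sequences.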


Instruction Clock Count Assumptions:
1. The instruction has been prefetched, decoded, and is ready for execution.
2. Bus cycles do not require wait states.
3. There are no local bus HOLD requests delaying processor access to the bus.
4. No exceptions are detected during instruction execution.
5. If an effective address is calculated, it does not use two general register
   components. One register, scaling and displacement can be used within the
   clock counts shown. However, if the effective address calculation uses two
   general register components, add 1 clock to the clock count shown.
6. Memory reference instruction accesses byte or aligned 16-bit operands.


Instruction Clock Count Notation
- If two clock counts are given, the smaller refers to a register operand and the
  larger refers to a memory operand.
- n = number of times repeated.
- m = number of components in the next instruction executed, where the entire
  displacement (if any) counts as one component, the entire immediate data (if
  any) counts as one component, and all other bytes of the instruction and
  prefix(es) each count as one component.


Misaligned or 32-Bit Operand Accesses:
- If an instruction accesses a misaligned 16-bit operand or a 32-bit operand on
  an even address, add:
  2* clocks for read or write.
  4** clocks for read and write.
- If an instruction accesses a 32-bit operand on an odd address, add:
  4* clocks for read or write.
  8** clocks for read and write.


Wait states add 1 clock per wait state to instruction execution for each data
access.
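The adjustments described above (two-register effective address, misaligned or 32-bit operands, and wait states) can be combined into one helper. This is a sketch under the stated rules only; the function name, parameter names and access-kind labels are hypothetical and do not appear in the data sheet.

```python
# Extra clocks as (read or write, read and write), taken from the text.
ACCESS_PENALTY = {
    "aligned": (0, 0),                         # byte or aligned 16-bit operand
    "word_misaligned_or_dword_even": (2, 4),   # misaligned 16-bit, or 32-bit on even address
    "dword_odd": (4, 8),                       # 32-bit operand on odd address
}


def adjusted_clocks(base, access="aligned", read_and_write=False,
                    two_register_ea=False, wait_states=0, data_accesses=0):
    """Apply the clock count adjustments to a base count from Table 8.1."""
    single, both = ACCESS_PENALTY[access]
    clocks = base + (both if read_and_write else single)
    if two_register_ea:
        clocks += 1                            # two general registers in the address
    clocks += wait_states * data_accesses      # 1 clock per wait state per data access
    return clocks


if __name__ == "__main__":
    print(adjusted_clocks(4, access="word_misaligned_or_dword_even"))
    print(adjusted_clocks(7, access="dword_odd", read_and_write=True,
                          two_register_ea=True, wait_states=1, data_accesses=2))
```

The base count still has to be picked from the register or memory column of the table before any of these adjustments apply.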


GENERAL DATA TRANSFER


MOV = Move:


Register to Register/Memory


Register from Register/Memory


PUSH = Push:


Register/Memory


Register (Short Form)


Segment Register (ES, CS, SS or DS)


1000100w | mod reg r/m

1000101w | mod reg r/m

1100011w | mod 000 r/m | immediate data

1011 w reg | immediate data

1010000w | full displacement

1010001w | full displacement

10001110 | mod sreg3 r/m

10001100 | mod sreg3 r/m


2/4' * 


2 


XCHG = Exchange


Register/Memory with Register             1000011w | mod reg r/m


Register with Accumulator (Short Form)    10010 reg


IN = Input from:


Number of Data Cycles

Clock Counts    Number of Data Cycles    Notes

26' 
6' 
a, b, C 


26' 
6' 
a, b, C 
:/..- 
2 ' 
6' 
a, b, C 


6' 
a, b, C 


S, b, C 


11000101 | mod reg r/m

11000100 | mod reg r/m

00001111 | 10110100 | mod reg r/m

00001111 | 10110101 | mod reg r/m

00001111 | 10110010 | mod reg r/m

11111000

11111100

PUSHF = Push Flags


SAHF = Store AH into Flags


Register to Register
Register to Memory                                                                    7**     2**     a
Memory to Register                       0000001w                                     6*      1*      a
Immediate to Register/Memory             100000sw | immediate data                    2/7**   0/2**   a
Immediate to Accumulator (Short Form)    0000010w | immediate data                    2


ADC = Add with Carry

Register to Register                     000100dw | mod reg r/m
Register to Memory                       0001000w | mod reg r/m                       7**     2**
Memory to Register                       0001001w | mod reg r/m                       6*      1*
Immediate to Register/Memory             100000sw | mod 010 r/m | immediate data      2/7**   0/2**   a
Immediate to Accumulator (Short Form)    0001010w | immediate data


INC = Increment

Register/Memory                          1111111w | mod 000 r/m                       2/6**   0/2**   a
Register (Short Form)                    01000 reg


SUB = Subtract

Register from Register                   001010dw | mod reg r/m

Immediate from Register/Memory


Immediate from Accumulator (Short Form)


Accumulator with Register/Memory


Multiplier-Byte
          -Word
          -Doubleword


IMUL = Integer Multiply (Signed)


Accumulator with Register/Memory


Multiplier-Byte
          -Word
          -Doubleword


Format    Clock Counts    Number of Data Cycles    Notes


100101 
OOw Imodreg 
r/ml 
7" 
2" 
a 


10010101 
w !modreg 
r/ml 
6' 
a 


!100000sw 
!mod101 
rIm I immediate 
data 
* 
0/1" 
a 


j0010l10wl 
immediate 
data 


00011 
Odw 
Imodreg 
r/ml 


00011 
OOw Imodreg 
r/ml 
2" 
a 


0001 
101w 
lmodreg 
r/ml 
l' 


1 OOOOOsw 
Imod01 
1 
0/2" 
a 


0001110W! 


0/2" 
e 


12-17/15-20 
0/1 
e,n 
12-25/15-28' 
0/1' 
e,n 
12-41/17-46' 
0/2' 
e,n 


12-17115-20 
Oil 
e,n 
12-25/15-28' 
011' 
e,n 
12-41/17-46' 
0/2' 
e,n 


12-17115-20 
0/1 
e,n 
12-25/15-28' 
011' 
e,n 
12-41/17-46' 
0/2' 
e,n 


13-26/14-27' 
011' 
e,n 
13-42/16-45' 
0/2' 
a,n 


Register 
with Register/Memory 


Multiplier-Byte 


-Word 


-Doubleword 


Register/Memory with Immediate to Register    011010s1 | mod reg r/m | immediate data


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction    Format    Clock Counts    Number of Data Cycles    Notes

ARITHMETIC (Continued) 


DIY ~ Divide (Unsigned) 


Accumulator 
by Register/Memory 
1111011w 
ImOdl10 
r/ml 


Drvisor-Byte 
0/1 
a,O 


-Word 
0/1' 
a,o 


-Doubleword 
0/2' 
8,0 


IDIV = Integer Divide (Signed)


Accumulator by Register/Memory
1111011w | mod 111 r/m |


Divisor-Byte
0/1
a,o


-Word
0/1
a,o


-Doubleword
0/2'
a,o


AAD = ASCII Adjust for Divide
11010101
00001010
19


AAM = ASCII Adjust for Multiply
11010100
17


CBW = Convert Byte to Word
10011000


CWD = Convert Word to Double Word
10011001


LOGIC 


Register/Memory by 1 
3/7" 
0/2" 
a 


Register/Memory by CL 
3/7"
0/2"
a


3/7"
0/2"
a


9/10"
0/2"
a


9/10"
0/2"
a 


9/10" 
0/2" 
a 


TTT   Instruction
000   ROL
001   ROR
010   RCL
011   RCR
100   SHL/SAL
101   SHR
111   SAR


SHLD = Shift Left Double


Register/Memory
by Immediate
00001111 | 10100100 | mod reg r/m | immed 8-bit data
3/7"
0/2"


Register/Memory by CL
00001111 | 10100101 | mod reg r/m |
3/7"
0/2"


SHRD = Shift Right Double


Register/Memory
by Immediate
00001111 | 10101100 | mod reg r/m | immed 8-bit data
3/7"
0/2"


Register/Memory by CL
00001111 | 10101101 | mod reg r/m |
3/7"
0/2"


AND = And 


Register to Register 
001000dw | mod reg r/m |
2 




Table 8.1. 80376 Instruction 
Set Clock Count Summary (Continued) 


Instruction | Format | Clock Counts | Number of Data Cycles | Notes


LOGIC
(Continued)


Register to Memory
0010000w | mod reg r/m |
7"
2"
a


Memory to Register
0010001w | mod reg r/m |
1'


Immediate to Register/Memory
1000000w | mod 100 r/m | immediate data
0/2"


Immediate to Accumulator
(Short Form)
0010010w | immediate data


TEST = And Function to Flags, No Result


Register/Memory
and Register
1000010w | mod reg r/m |
0/1"


Immediate Data and Register/Memory
1111011w | mod 000 r/m |
2/5'
0/1'
a


Immediate Data and Accumulator 


(Short Form) 


OR = Or


Register to Register


Register to Memory
7"
2"
a


Memory to Register
6'
1'
a


Immediate to Register/Memory
2/7"
0/2"
a


Immediate to Accumulator 
(Short Form) 


XOR = Exclusive Or


Register to Register
2


Register to Memory
7"
2"
a


Memory to Register
6'
1'


Immediate to Register
r/m | immediate data
2/7"
0/2"
a


Immediate to Accumulator
(Short Form)
immediate data
2


NOT = Invert
Register/Memory
1111011w | mod 010 r/m |
2/6"
0/2"


STRING MANIPULATION 


CMPS = Compare Byte Word
1010011w
10'
2'


INS = Input Byte/Word from DX Port
0110110w
9"
1"
a,f,k
29"
1"
a,f,l


LODS = Load Byte/Word to AL/AX/EAX
1010110w
5'
1'


MOVS = Move Byte Word
1010010w
7"
2"


OUTS = Output Byte/Word to DX Port
0110111w
8"
1"
a,f,k
28"
1"
a,f,l


SCAS = Scan Byte Word
1010111w
7'
1'


STOS = Store Byte/Word from AL/AX/EAX
1010101w
4'
1'
a


XLAT = Translate String
11010111
5'
1'
a


REPEATED 
STRING MANIPULATION 


Repeated by Count in CX or ECX 


REPE CMPS = Compare String


(Find Non-Match)
11110011 | 1010011w
5 + 9n"
2n"




Table 8.1. 80376 Instruction 
Set Clock Count Summary 
(Continued) 


Instruction | Format | Clock Counts | Number of Data Cycles | Notes


REPEATED STRING MANIPULATION 
(Continued) 


REPNE CMPS = Compare String


(Find Match)
11110010 | 1010011w
5 + 9n"
2n"
e


REP INS = Input String
11110011 | 0110110w
7 + 6n'
1n'
a,f,k
27 + 6n'
1n'
a,f,l


REP LODS = Load String
11110011 | 1010110w
6n'
1n'
a


REP MOVS = Move String
11110011 | 1010010w
2n"
a


REP OUTS = Output String
11110011 | 0110111w
1n'
a,f,k
1n'
a,f,l


REPE SCAS = Scan String


(Find Non-AL/AX/EAX)
11110011 | 1010111w
1n'
a


REPNE SCAS = Scan String


(Find AL/AX/EAX)
11110010 | 1010111w
1n'
a


REP STOS = Store String
11110011
1n'
a


BIT MANIPULATION 


BSF = Scan Bit Forward
10 + 3n"
2n"
a


BSR = Scan Bit Reverse
10 + 3n"
2n"
a


BT = Test Bit 


Register/Memory, Immediate 
3/6' 
0/1' 
a 


Register/Memory, Register 
3/12' 
0/1' 
a 


BTC = Test Bit and Complement


Register/Memory,
Immediate
6/8'
0/2'


6/13'
0/2'
a


6/8'
0/2'
a


Register/Memory, Register
6/13'
0/2'
a


BTS = Test Bit and Set


Register/Memory,
Immediate
6/8'
0/2'
a


Register/Memory, Register
00001111
10101011
6/13'
0/2'
a


CONTROL TRANSFER 


CALL = Call 


Direct within Segment
11101000 | full displacement
9 + m·


Register/Memory


Indirect within Segment
11111111 | mod 010 r/m |
9 + m/12 + m
2/3
a,j


Direct
Intersegment
10011010 | unsigned full offset, selector
42 + m
9
c,d,j




Table 8.1. 80376 Instruction 
Set Clock Count Summary (Continued) 


Instruction | Format | Clock Counts | Number of Data Cycles | Notes


CONTROL TRANSFER
(Continued)


(Direct Intersegment)
:b 


Via Call Gate to Same Privilege Level
64 + m
13
a,c,d,j


Via Call Gate to Different Privilege Level,


(No Parameters)
13
a,c,d,j


Via Call Gate to Different Privilege Level,


(x Parameters)
13 + 4x
a,c,d,j


From 386 Task to 386 TSS
124
a,c,d,j


Indirect Intersegment
11111111 | mod 011 r/m |
10
a,c,d,j


Via Call Gate to Same Privilege Level
14
a,c,d,j


Via Call Gate to Different Privilege Level,


(No Parameters)
14
a,c,d,j


Via Call Gate to Different Privilege Level,


(x Parameters)
14 + 4x
a,c,d,j


From 386 Task to 386 TSS
399
130
a,c,d,j


JMP = Unconditional Jump


Short
7 + m


Direct within Segment
7 + m


9 + m/14 + m
2/4
a,j


Direct Intersegment
37 + m
c,d,j


Via Call Gate to Same Privilege Level
53 + m
9
a,c,d,j 


From 386 Task to 386 TSS 
395 
124 
a,c,d,j 


Indirect Intersegment 
| mod 101 r/m |
37 + m 
a,c,d,j 


Via Call Gate to Same Privilege Level 
59 + m 
13 
a,c,d,j 


From 386 Task to 386 TSS 
401 
124 
a,c,d,j 




Number 
of Data 
Cycles 


"0000'0 


, '00' 
0" 
] 


"00'0'0 


a,i,p 


a,i,p 


a,c,d,i,p 


4 
a,c,d,i,p 


4 
c,d,j,p 


4 
c,d,j,p 


to Different 
Privilege Level 


Intersegment 


Intersegment 
Adding Immediate 
to SP 


CONDITIONAL 
JUMPS 


NOTE: 
Times Are Jump "Taken 
or Not Taken" 


JO = Jump on Overflow


8-Bit Displacement


JNO = Jump on Not Overflow


8-Bit Displacement


JE/JZ = Jump on Equal/Zero


8-Bit Displacement


JNE/JNZ = Jump on Not Equal/Not Zero


8-Bit Displacement
01110101


JBE/JNA = Jump on Below or Equal/Not Above


8-Bit Displacement
01110110


JNBE/JA = Jump on Not Below or Equal/Above


8-Bit Displacement
01110111
Full Displacement


JS = Jump on Sign


8-Bit Displacement
01111000


00001" 
, 




Number of Data Cycles


CONDITIONAL 
JUMPS 
(Continued) 


JNS = Jump on Not Sign


8-Bit Displacement
8-bit displ


10001001 | full displacement


JP/JPE = Jump on Parity/Parity Even


8-Bit Displacement
01111010


Full Displacement
00001111


JNP/JPO = Jump on Not Parity/Parity Odd


8-Bit Displacement
01111011


8-bit displ


10001010 | full displacement


JNLE/JG = Jump on Not Less or Equal/Greater


8-Bit Displacement


Full Displacement


JCXZ = Jump on CX Zero


CONDITIONAL 
BYTE SET 


NOTE: Times Are Register/Memory 


SETO = Set Byte on Overflow


To Register/Memory


SETNO = Set Byte on Not Overflow


To Register/Memory


SETB/SETNAE = Set Byte on Below/Not Above or Equal


To Register/Memory
00001111 | 10010010 | mod 000 r/m |


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued) 


Instruction | Format | Clock Counts | Number of Data Cycles | Notes


CONDITIONAL 
BYTE SET (Continued) 


SETNB = Set Byte on Not Below/Above or Equal


To Register/Memory
00001111 | 10010011 | mod 000 r/m |
4/5'
0/1'


SETE/SETZ = Set Byte on Equal/Zero
To Register/Memory
00001111 | 10010100 | mod 000 r/m |
0/1'


SETNE/SETNZ = Set Byte on Not Equal/Not Zero


To Register/Memory
00001111 | 10010101 | mod 000
11' 


0/1' 
I 


To Register/Memory 
0/1' 


SETS = Set Byte on Sign


To Register/Memory
0/1'


SETNS = Set Byte on Not Sign


To Register/Memory 
0/1' 


To Register/Memory 
4/5'
0/1'
a


4/5'
0/1'
a


4/5'
0/1'
a


4/5'
0/1'
a


SETLE/SETNG = Set Byte on Less or Equal/Not Greater
0 


4/5' 
0/1' 


| mod 000 r/m |
4/5' 
0/1' 


ENTER = Enter Procedure
16-bit displacement, 8-bit level


L = 0
10


L = 1
14
1
L > 1
17 + 8(n - 1)
4(n - 1)


LEAVE = Leave Procedure
11001001
a 




Number of Data Cycles


INT = Interrupt: 


Type Specified 


Via Interrupt or Trap Gate 


to Same Privilege Level 


Via Interrupt 
or Trap Gate 


to Different
Privilege 
Level 


Via Interrupt or Trap Gate 


to Same 
Privilege 
Level 


Via Interrupt or Trap Gate 


to Different 
Privilege 
Level 


Via Interrupt or Trap Gate


to Same Privilege Level


Via Interrupt
or Trap Gate


to Different Privilege Level 


~". 
Gate 




Instruction | Format


INTERRUPT 
INSTRUCTIONS 
(Continued) 


BOUND = Interrupt 5 if Detect Value Out of Range
01100010 | mod reg r/m |


If In Range 


If Out of Range: 


Via Interrupt or Trap Gate 


to Same 
Privilege Level 


Via Interrupt or Trap Gate 


to Different 
Privilege 
Level 


From 386 Task to 386 TSS via Task Gate 


INTERRUPT 
RETURN 


IRET = Interrupt Return
11001111 


To the Same Privilege Level (within Task) 


To Different Privilege Level (within Task) 


From 386 Task to 386 TSS 


PROCESSOR CONTROL 


HLT = HALT


CR0


Register
from CR0


DR0-3 from Register


DR6-7
from Register


Register from DR6-7
00001111 | 00100001 | 11 eee reg |


Register from DR0-3
00001111 | 00100001 | 11 eee reg |


NOP = No Operation
10010000


WAIT = Wait until BUSY# Pin is Negated
10011011


Clock Counts | Number of Data Cycles | Notes


* 


0 
a,c,d,j,o,p 


14 
c,d,j,p 


14 
c,d,j,p 


138 
c,d,j,p 


42 
a,c,d,j,p 


86 
a,c,d,j,p 


328 
138 
c,d,j,p 


b 


10 


6 


22 


16 
b 


14 
b 


22 
b 


6 




Table 8.1. 80376 Instruction Set Clock Count Summary (Continued) 


Instruction | Format | Clock Counts | Number of Data Cycles | Notes


PROCESSOR EXTENSION INSTRUCTIONS 


Processor Extension Escape
11011TTT | mod LLL r/m |
See 80387SX Data Sheet
a


TTT and LLL bits are opcode 


information for coprocessor. 
* 
PREFIX BYTES 


Address Size Prefix
01100111


LOCK = Bus Lock Prefix
11110000


Operand Size Prefix
01100110


Segment Override Prefix


CS:
00101110


DS:
00111110


ES:
00100110


FS:
01100100


GS:


SS:


PROTECTION CONTROL 


From Register/Memory 
a 


LAR = Load Access Rights


From Register/Memory
17/18'
1'
a,c,i,p


13"
3'
a,e


LIDT = Load Interrupt Descriptor


Table Register
13"
3'
a,e


LLDT = Load Local Descriptor Table Register to
Register/Memory
00001111 | 00000000 | mod 010 r/m |
24/28'
5'
a,c,9,p


LMSW = Load Machine Status Word


From Register/Memory
00001111 | 00000001 | mod 110 r/m |
10/13'
1'
a,e


LSL = Load Segment Limit


From Register/Memory
00001111 | 00000011 | mod reg r/m |


Byte-Granular
Limit
24/27'
2'
a,c,i,p


Page-Granular
Limit
29/32'
2'
a,c,i,p


LTR = Load Task Register


From Register/Memory
00001111 | 00000000 | mod 001 r/m |
27/31'
4'
a,o,e,p


SGDT = Store Global Descriptor


Table Register
00001111 | 00000001 | mod 000 r/m |
11'
3'
a


SIDT = Store Interrupt Descriptor


Table Register
00001111 | 00000001 | mod 001 r/m |
11'
3'
a


SLDT = Store Local Descriptor Table Register


To Register/Memory
00001111 | 00000000 | mod 000 r/m |
2/2'
4'
a
PROTECTION CONTROL
(Continued)


SMSW = Store Machine Status Word


STR = Store Task Register


To Register/Memory


VERR = Verify Read Access


Register/Memory


Clock Counts | Number of Data Cycles | Notes


2/2'
1'
a,c


2/2'
1'
a 


10/11" 
2" 
a,c,i,p 


15/16" 
2" 
a,c,i,p 


NOTES:
a. An exception 13 fault (general protection violation) will occur if the memory operand in CS, DS, ES, FS or GS cannot be used due to either a segment limit violation or an access rights violation. If a stack limit is violated, an exception 12 (stack segment limit violation or not present) occurs.

b. For segment load operations, the CPL, RPL and DPL must agree with the privilege rules to avoid an exception 13 fault (general protection violation). The segment's descriptor must indicate "present" or an exception 11 (CS, DS, ES, FS, GS not present) occurs. If the SS register is loaded and a stack segment not present is detected, an exception 12 (stack segment limit violation or not present) occurs.

c. All segment descriptor accesses in the GDT or LDT made by this instruction will automatically assert LOCK# to maintain descriptor integrity in multiprocessor systems.

d. JMP, CALL, INT, RET and IRET instructions referring to another code segment will cause an exception 13 (general protection violation) if an applicable privilege rule is violated.

e. An exception 13 fault occurs if CPL is greater than 0.

f. An exception 13 fault occurs if CPL is greater than IOPL.

g. The IF bit of the flag register is not updated if CPL is greater than IOPL. The IOPL field of the flag register is updated only if CPL = 0.

h. Any violation of privilege rules as applied to the selector operand does not cause a protection exception; rather, the zero flag is cleared.

i. If the coprocessor's memory operand violates a segment limit or segment access rights, an exception 13 fault (general protection exception) will occur before the ESC instruction is executed. An exception 12 fault (stack segment limit violation or not present) will occur if the stack limit is violated by the operand's starting address.

j. The destination of a JMP, CALL, INT, RET or IRET must be in the defined limit of a code segment or an exception 13 fault (general protection violation) will occur.

k. If CPL ≤ IOPL

l. If CPL > IOPL

m. LOCK# is automatically asserted, regardless of the presence or absence of the LOCK# prefix.

n. The 80376 uses an early-out multiply algorithm. The actual number of clocks depends on the position of the most significant bit in the operand (multiplier). Clock counts given are minimum to maximum. To calculate actual clocks use the following formula:

Actual Clock = if m <> 0 then max([log2 |m|], 3) + 9 clocks;
               if m = 0 then 12 clocks
(where m is the multiplier)

o. An exception may occur, depending on the value of the operand.

p. LOCK# is asserted during descriptor table accesses.
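
The formula in note n can be sketched in a few lines of Python. This is only an illustration, not Intel's definition: the function name `early_out_multiply_clocks` is hypothetical, and the square brackets around log2 |m| are read here as a ceiling, an assumption chosen because it reproduces the 12-17, 12-25 and 12-41 ranges listed for byte, word and doubleword multipliers.

```python
def early_out_multiply_clocks(m: int) -> int:
    """Estimate multiply clocks per note n; m is the multiplier value.

    Assumption: [log2 |m|] is taken as the ceiling of log2 |m|.
    """
    if m == 0:
        return 12
    magnitude = abs(m)
    # For an integer x >= 1, ceil(log2(x)) equals (x - 1).bit_length().
    ceil_log2 = (magnitude - 1).bit_length()
    return max(ceil_log2, 3) + 9


# Largest unsigned byte, word and doubleword multipliers.
for multiplier in (0, 1, 0xFF, 0xFFFF, 0xFFFFFFFF):
    print(hex(multiplier), early_out_multiply_clocks(multiplier))
```

Small multipliers hit the floor of 12 clocks because of the max(..., 3) term, which is why the table gives 12 as the minimum for every operand size.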
All instruction encodings are subsets of the general instruction format shown in Figure 8.1. Instructions consist of one or two primary opcode bytes, possibly an address specifier consisting of the "mod r/m" byte and "scaled index" byte, a displacement if required, and an immediate data field if required.


Within the primary opcode or opcodes, smaller encoding fields may be defined. These fields vary according to the class of operation. The fields define such information as direction of the operation, size of the displacements, register encoding, or sign extension.


Almost all instructions referring to an operand in memory have an addressing mode byte following the primary opcode byte(s). This byte, the mod r/m byte, specifies the address mode to be used. Certain encodings of the mod r/m byte indicate that a second addressing byte, the scale-index-base byte, follows the mod r/m byte to fully specify the addressing mode.


Addressing modes can include a displacement immediately following the mod r/m byte, or scaled index byte. If a displacement is present, the possible sizes are 8, 16 or 32 bits.


If the instruction specifies an immediate operand, the immediate operand follows any displacement bytes. The immediate operand, if specified, is always the last field of the instruction.


Figure 8.1 illustrates several of the fields that can appear in an instruction, such as the mod field and the r/m field, but the Figure does not show all fields. Several smaller fields also appear in certain instructions, sometimes within the opcode bytes themselves. Table 8.2 is a complete list of all fields appearing in the 80376 instruction set. Further ahead, following Table 8.2, are detailed tables for each field.


[Figure 8.1: general instruction format. Fields in order: opcode (one or two bytes; T represents an opcode bit); "mod r/m" byte (mod in bits 7-6, TTT in bits 5-3, r/m in bits 2-0); "s-i-b" byte (ss in bits 7-6, index in bits 5-3, base in bits 2-0); address displacement (4, 2, 1 bytes or none); immediate data (4, 2, 1 bytes or none). The "mod r/m" and "s-i-b" bytes together form the register and address mode specifier.]


Field Name   Description                                                                    Number of Bits
w            Specifies if Data is Byte or Full Size (Full Size is either 16 or 32 Bits)     1
d            Specifies Direction of Data Operation                                          1
s            Specifies if an Immediate Data Field Must be Sign-Extended                     1
reg          General Register Specifier                                                     3
mod r/m      Address Mode Specifier (Effective Address can be a General Register)           2 for mod; 3 for r/m
ss           Scale Factor for Scaled Index Address Mode                                     2
index        General Register to be used as Index Register                                  3
base         General Register to be used as Base Register                                   3
sreg2        Segment Register Specifier for CS, SS, DS, ES                                  2
sreg3        Segment Register Specifier for CS, SS, DS, ES, FS, GS                          3
tttn         For Conditional Instructions, Specifies a Condition Asserted or a Condition Negated   4
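
As a rough illustration of the field widths listed above, the following Python sketch splits a "mod r/m" byte and an "s-i-b" byte into their fields. The function names are hypothetical, and the bit positions (7-6, 5-3, 2-0) are taken from the layout of Figure 8.1.

```python
def split_modrm(byte: int) -> dict:
    """Split a "mod r/m" byte into mod (2 bits), reg/TTT (3 bits), r/m (3 bits)."""
    return {
        "mod": (byte >> 6) & 0b11,
        "reg": (byte >> 3) & 0b111,  # also used as an opcode extension (TTT)
        "rm": byte & 0b111,
    }


def split_sib(byte: int) -> dict:
    """Split an "s-i-b" byte into ss (2 bits), index (3 bits), base (3 bits)."""
    return {
        "ss": (byte >> 6) & 0b11,
        "index": (byte >> 3) & 0b111,
        "base": byte & 0b111,
    }


print(split_modrm(0x44))  # mod=01, reg=000, r/m=100
print(split_sib(0x88))    # ss=10, index=001, base=000
```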
16-Bit Extensions of the Instruction Set


Two prefixes, the Operand Size Prefix (66H) and the Effective Address Size Prefix (67H), allow overriding individually the default selection of operand size and effective address size. These prefixes may precede any opcode bytes and affect only the instruction they precede. If necessary, one or both of the prefixes may be placed before the opcode bytes. The presence of the Operand Size Prefix and the Effective Address Prefix will allow 16-bit data operation and 16-bit effective address calculations.


For instructions with more than one prefix, the order 
of prefixes is unimportant. 
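

The effect of the two size prefixes can be sketched as a small scan over the leading bytes of an instruction. This is a simplified illustration: the function name `effective_sizes` is hypothetical, 32-bit is assumed as the default operand and address size, and other prefix bytes (segment overrides, LOCK) are deliberately not handled.

```python
OPERAND_SIZE_PREFIX = 0x66
ADDRESS_SIZE_PREFIX = 0x67


def effective_sizes(instruction: bytes) -> tuple:
    """Return (operand_bits, address_bits, opcode_offset) for one instruction.

    Assumption: defaults are 32-bit; only 66H and 67H are recognized here.
    """
    operand_bits, address_bits = 32, 32
    offset = 0
    while offset < len(instruction) and instruction[offset] in (
        OPERAND_SIZE_PREFIX,
        ADDRESS_SIZE_PREFIX,
    ):
        if instruction[offset] == OPERAND_SIZE_PREFIX:
            operand_bits = 16
        else:
            address_bits = 16
        offset += 1
    return operand_bits, address_bits, offset


# Prefix order does not matter; both calls give the same sizes.
print(effective_sizes(bytes([0x66, 0x67, 0x01, 0xC8])))
print(effective_sizes(bytes([0x67, 0x66, 0x01, 0xC8])))
```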


Unless specified otherwise, instructions with 8-bit and 16-bit operands do not affect the contents of the high-order bits of the extended registers.


Within the instruction are several fields indicating 
register selection, addressing mode and so on. 


ENCODING OF OPERAND LENGTH (w) FIELD 


For any given instruction performing a data operation, the instruction will execute as a 32-bit operation. Within the constraints of the operation size, the w field encodes the operand size as either one byte or the full operation size, as shown in the table below.


w Field   Operand Size During 16-Bit Data Operations   Operand Size During 32-Bit Data Operations
0         8 Bits                                        8 Bits
1         16 Bits                                       32 Bits


ENCODING OF THE GENERAL REGISTER (reg) FIELD


The general register is specified by the reg field, which may appear in the primary opcode bytes, or as the reg field of the "mod r/m" byte, or as the r/m field of the "mod r/m" byte.


Encoding of reg Field When w Field is not Present in Instruction


reg Field   Register Selected During 16-Bit Data Operations   Register Selected During 32-Bit Data Operations
000         AX                                                 EAX
001         CX                                                 ECX
010         DX                                                 EDX
011         BX                                                 EBX
100         SP                                                 ESP
101         BP                                                 EBP
110         SI                                                 ESI
111         DI                                                 EDI


Encoding of reg Field When w Field is Present in Instruction


Register Specified by reg Field During 16-Bit Data Operations:


reg   Function of w Field
      (when w = 0)   (when w = 1)
000   AL             AX
001   CL             CX
010   DL             DX
011   BL             BX
100   AH             SP
101   CH             BP
110   DH             SI
111   BH             DI


Register Specified by reg Field During 32-Bit Data Operations:


reg   Function of w Field
      (when w = 0)   (when w = 1)
000   AL             EAX
001   CL             ECX
010   DL             EDX
011   BL             EBX
100   AH             ESP
101   CH             EBP
110   DH             ESI
111   BH             EDI
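
The reg tables above reduce to three lookup lists. The sketch below is illustrative only; `register_name` is a hypothetical helper, and passing `w=None` is this example's way of saying that the instruction has no w field.

```python
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
REG32 = ["E" + name for name in REG16]


def register_name(reg: int, w=None, operand_bits: int = 32) -> str:
    """Map a 3-bit reg code to a register name.

    w=None means the w field is absent, so the full operand size applies.
    w=0 selects the byte registers regardless of operand_bits.
    """
    if w == 0:
        return REG8[reg]
    return REG16[reg] if operand_bits == 16 else REG32[reg]


print(register_name(0b110, None, 16))  # SI
print(register_name(0b100, 0))         # AH
print(register_name(0b100, 1, 32))     # ESP
```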
ENCODING OF THE SEGMENT REGISTER (sreg) FIELD


The sreg field in certain instructions is a 2-bit field allowing one of the CS, DS, ES or SS segment registers to be specified. The sreg field in other instructions is a 3-bit field, allowing the FS and GS segment registers to be specified also.


2-Bit sreg2 Field   Segment Register Selected
00                  ES
01                  CS
10                  SS
11                  DS


3-Bit sreg3 Field   Segment Register Selected
000                 ES
001                 CS
010                 SS
011                 DS
100                 FS
101                 GS
110                 do not use
111                 do not use
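
A minimal Python sketch of the two segment register tables follows. The helper name `segment_register` is hypothetical, and raising `ValueError` for the "do not use" codes is this example's choice, not documented processor behavior.

```python
SEGMENT_REGISTERS = ["ES", "CS", "SS", "DS", "FS", "GS"]


def segment_register(code: int, width: int) -> str:
    """Decode an sreg2 (width=2) or sreg3 (width=3) field."""
    if width == 2 and 0 <= code <= 0b11:
        return SEGMENT_REGISTERS[code]
    if width == 3 and 0 <= code <= 0b101:
        return SEGMENT_REGISTERS[code]
    # sreg3 codes 110 and 111 are marked "do not use" in the table.
    raise ValueError(f"invalid sreg{width} code {code:b}")


print(segment_register(0b11, 2))   # DS
print(segment_register(0b101, 3))  # GS
```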


Except for special instructions, such as PUSH or POP, where the addressing mode is pre-determined, the addressing mode for the current instruction is specified by addressing bytes following the primary opcode. The primary addressing byte is the "mod r/m" byte, and a second byte of addressing information, the "s-i-b" (scale-index-base) byte, can be specified.


The s-i-b byte (scale-index-base byte) is specified when using 32-bit addressing mode and the "mod r/m" byte has r/m = 100 and mod = 00, 01 or 10. When the s-i-b byte is present, the 32-bit addressing mode is a function of the mod, ss, index, and base fields.


The primary addressing byte, the "mod r/m" byte, also contains three bits (shown as TTT in Figure 8.1) sometimes used as an extension of the primary opcode. The three bits, however, may also be used as a register field (reg).


When calculating an effective address, either 16-bit addressing or 32-bit addressing is used. 16-bit addressing uses 16-bit address components to calculate the effective address while 32-bit addressing uses 32-bit address components to calculate the effective address. When 16-bit addressing is used, the "mod r/m" byte is interpreted as a 16-bit addressing mode specifier. When 32-bit addressing is used, the "mod r/m" byte is interpreted as a 32-bit addressing mode specifier.


Tables on the following three pages define all encodings of all 16-bit addressing modes and 32-bit addressing modes.
mod r/m   Effective Address
00 000    DS:[EAX]
00 001    DS:[ECX]
00 010    DS:[EDX]
00 011    DS:[EBX]
00 100    s-i-b is present
00 101    DS:d32
00 110    DS:[ESI]
00 111    DS:[EDI]

01 000    DS:[EAX+d8]
01 001    DS:[ECX+d8]
01 010    DS:[EDX+d8]
01 011    DS:[EBX+d8]
01 100    s-i-b is present
01 101    SS:[EBP+d8]
01 110    DS:[ESI+d8]
01 111    DS:[EDI+d8]

10 000    DS:[EAX+d32]
10 001    DS:[ECX+d32]
10 010    DS:[EDX+d32]
10 011    DS:[EBX+d32]
10 100    s-i-b is present
10 101    SS:[EBP+d32]
10 110    DS:[ESI+d32]
10 111    DS:[EDI+d32]

11 000    register-see below
11 001    register-see below
11 010    register-see below
11 011    register-see below
11 100    register-see below
11 101    register-see below
11 110    register-see below
11 111    register-see below

Register Specified by reg or r/m during Normal Data Operations:


mod r/m   Function of w Field
          (when w = 0)   (when w = 1)
11 000    AL             EAX
11 001    CL             ECX
11 010    DL             EDX
11 011    BL             EBX
11 100    AH             ESP
11 101    CH             EBP
11 110    DH             ESI
11 111    BH             EDI


Register Specified by reg or r/m during 16-Bit Data Operations (66H Prefix):


mod r/m   Function of w Field
          (when w = 0)   (when w = 1)
11 000    AL             AX
11 001    CL             CX
11 010    DL             DX
11 011    BL             BX
11 100    AH             SP
11 101    CH             BP
11 110    DH             SI
11 111    BH             DI


mod r/m   Effective Address
00 000    DS:[BX+SI]
00 001    DS:[BX+DI]
00 010    SS:[BP+SI]
00 011    SS:[BP+DI]
00 100    DS:[SI]
00 101    DS:[DI]
00 110    DS:d16
00 111    DS:[BX]

01 000    DS:[BX+SI+d8]
01 001    DS:[BX+DI+d8]
01 010    SS:[BP+SI+d8]
01 011    SS:[BP+DI+d8]
01 100    DS:[SI+d8]
01 101    DS:[DI+d8]
01 110    SS:[BP+d8]
01 111    DS:[BX+d8]

10 000    DS:[BX+SI+d16]
10 001    DS:[BX+DI+d16]
10 010    SS:[BP+SI+d16]
10 011    SS:[BP+DI+d16]
10 100    DS:[SI+d16]
10 101    DS:[DI+d16]
10 110    SS:[BP+d16]
10 111    DS:[BX+d16]

11 000    register-see below
11 001    register-see below
11 010    register-see below
11 011    register-see below
11 100    register-see below
11 101    register-see below
11 110    register-see below
11 111    register-see below
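
The 16-bit addressing table is regular enough to generate from the mod and r/m fields. The sketch below is a hedged illustration with a hypothetical helper name, `effective_address_16`; it returns the same strings as the table rather than computing a numeric address.

```python
BASES_16 = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]
DISPLACEMENT_16 = ["", "+d8", "+d16"]


def effective_address_16(mod: int, rm: int) -> str:
    """Describe the 16-bit addressing mode selected by mod and r/m."""
    if mod == 0b11:
        return "register"
    if mod == 0b00 and rm == 0b110:
        return "DS:d16"  # special case: direct 16-bit displacement
    base = BASES_16[rm]
    # Modes that use BP default to the SS segment, all others to DS.
    segment = "SS" if "BP" in base else "DS"
    return f"{segment}:[{base}{DISPLACEMENT_16[mod]}]"


print(effective_address_16(0b00, 0b110))  # DS:d16
print(effective_address_16(0b01, 0b110))  # SS:[BP+d8]
print(effective_address_16(0b10, 0b011))  # SS:[BP+DI+d16]
```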


mod base   Effective Address
00 000     DS:[EAX + (scaled index)]
00 001     DS:[ECX + (scaled index)]
00 010     DS:[EDX + (scaled index)]
00 011     DS:[EBX + (scaled index)]
00 100     SS:[ESP + (scaled index)]
00 101     DS:[d32 + (scaled index)]
00 110     DS:[ESI + (scaled index)]
00 111     DS:[EDI + (scaled index)]

01 000     DS:[EAX + (scaled index) + d8]
01 001     DS:[ECX + (scaled index) + d8]
01 010     DS:[EDX + (scaled index) + d8]
01 011     DS:[EBX + (scaled index) + d8]
01 100     SS:[ESP + (scaled index) + d8]
01 101     SS:[EBP + (scaled index) + d8]
01 110     DS:[ESI + (scaled index) + d8]
01 111     DS:[EDI + (scaled index) + d8]

10 000     DS:[EAX + (scaled index) + d32]
10 001     DS:[ECX + (scaled index) + d32]
10 010     DS:[EDX + (scaled index) + d32]
10 011     DS:[EBX + (scaled index) + d32]
10 100     SS:[ESP + (scaled index) + d32]
10 101     SS:[EBP + (scaled index) + d32]
10 110     DS:[ESI + (scaled index) + d32]
10 111     DS:[EDI + (scaled index) + d32]


NOTE:
Mod field in "mod r/m" byte; ss, index, base fields in "s-i-b" byte.


ss   Scale Factor
00   x1
01   x2
10   x4
11   x8


Index   Index Register
000     EAX
001     ECX
010     EDX
011     EBX
100     no index reg**
101     EBP
110     ESI
111     EDI


**IMPORTANT NOTE:
When index field is 100, indicating "no index register," then ss field MUST equal 00. If index is 100 and ss does not equal 00, the effective address is undefined.
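
Putting the 32-bit "mod r/m" and "s-i-b" tables together, a decoder can be sketched as below. This is an illustrative reading of the tables, not a reference implementation: `effective_address_32` is a hypothetical name, the result is a descriptive string, and raising `ValueError` for the undefined index = 100, ss ≠ 00 case is this sketch's own choice.

```python
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]
DISPLACEMENT_32 = ["", "+d8", "+d32"]


def effective_address_32(modrm: int, sib=None) -> str:
    """Describe the 32-bit addressing mode for a mod r/m byte and optional s-i-b byte."""
    mod, rm = (modrm >> 6) & 0b11, modrm & 0b111
    if mod == 0b11:
        return "register"
    if rm != 0b100:  # no s-i-b byte
        if mod == 0b00 and rm == 0b101:
            return "DS:d32"
        segment = "SS" if rm == 0b101 else "DS"
        return f"{segment}:[{REG32[rm]}{DISPLACEMENT_32[mod]}]"
    if sib is None:
        raise ValueError("r/m = 100 requires an s-i-b byte")
    ss, index, base = (sib >> 6) & 0b11, (sib >> 3) & 0b111, sib & 0b111
    if index == 0b100:  # no index register
        if ss != 0:
            raise ValueError("index = 100 with ss != 00 is undefined")
        scaled = ""
    else:
        scaled = f"+{REG32[index]}*{1 << ss}"
    if mod == 0b00 and base == 0b101:
        return f"DS:[d32{scaled}]"
    segment = "SS" if base in (0b100, 0b101) else "DS"
    return f"{segment}:[{REG32[base]}{scaled}{DISPLACEMENT_32[mod]}]"


print(effective_address_32(0x45))        # SS:[EBP+d8]
print(effective_address_32(0x04, 0x88))  # DS:[EAX+ECX*4]
```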
ENCODING OF OPERATION DIRECTION (d) FIELD


In many two-operand instructions the d field is present to indicate which operand is considered the source and which is the destination.


d   Direction of Operation
0   Register/Memory <-- Register
    "reg" Field Indicates Source Operand;
    "mod r/m" or "mod ss index base" Indicates Destination Operand
1   Register <-- Register/Memory
    "reg" Field Indicates Destination Operand;
    "mod r/m" or "mod ss index base" Indicates Source Operand
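
For opcodes of the form xxxxxxdw, such as 000100dw in Table 8.1, the d bit sits in bit 1 of the opcode byte. A small Python sketch of the rule above, with a hypothetical helper name and operand strings supplied by the caller:

```python
def order_operands(opcode: int, reg_operand: str, rm_operand: str) -> tuple:
    """Return (destination, source) for an opcode of the form xxxxxxdw.

    Assumption: the d field is bit 1 of the opcode byte, as in 000100dw.
    """
    d = (opcode >> 1) & 1
    if d == 1:
        return reg_operand, rm_operand  # Register <-- Register/Memory
    return rm_operand, reg_operand      # Register/Memory <-- Register


# 00010001 (d = 0, w = 1) and 00010011 (d = 1, w = 1)
print(order_operands(0b00010001, "ECX", "DS:[EBX]"))
print(order_operands(0b00010011, "ECX", "DS:[EBX]"))
```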


The s field occurs primarily in instructions with immediate data fields. The s field has an effect only if the size of the immediate data is 8 bits and is being placed in a 16-bit or 32-bit destination.


s   Effect on Immediate Data8                                 Effect on Immediate Data 16/32
0   None                                                      None
1   Sign-Extend Data8 to Fill 16-Bit or 32-Bit Destination    None
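
The sign-extension rule can be sketched as follows. The helper `extend_immediate8` is hypothetical and models only the 8-bit immediate case described above; results are returned as unsigned bit patterns of the destination width.

```python
def extend_immediate8(s: int, data8: int, destination_bits: int) -> int:
    """Apply the s field to an 8-bit immediate.

    s has an effect only when an 8-bit immediate goes to a 16-bit or
    32-bit destination; otherwise the byte is returned unchanged.
    """
    if s == 1 and destination_bits in (16, 32):
        signed = data8 - 0x100 if data8 & 0x80 else data8
        return signed & ((1 << destination_bits) - 1)
    return data8


print(hex(extend_immediate8(1, 0xFE, 32)))  # 0xfffffffe
print(hex(extend_immediate8(1, 0x80, 16)))  # 0xff80
print(hex(extend_immediate8(0, 0xFE, 8)))   # 0xfe
```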


ENCODING OF CONDITIONAL TEST (tttn) FIELD


For the conditional instructions (conditional jumps and set on condition), tttn is encoded with n indicating to use the condition (n = 0) or its negation (n = 1), and ttt giving the condition to test.


Mnemonic   Condition                                tttn
O          Overflow                                 0000
NO         No Overflow                              0001
B/NAE      Below/Not Above or Equal                 0010
NB/AE      Not Below/Above or Equal                 0011
E/Z        Equal/Zero                               0100
NE/NZ      Not Equal/Not Zero                       0101
BE/NA      Below or Equal/Not Above                 0110
NBE/A      Not Below or Equal/Above                 0111
S          Sign                                     1000
NS         Not Sign                                 1001
P/PE       Parity/Parity Even                       1010
NP/PO      Not Parity/Parity Odd                    1011
L/NGE      Less Than/Not Greater or Equal           1100
NL/GE      Not Less Than/Greater or Equal           1101
LE/NG      Less Than or Equal/Not Greater Than      1110
NLE/G      Not Less or Equal/Greater Than           1111
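
The tttn table maps directly onto a 16-entry list, and flipping the n bit negates a condition. The Python sketch below is illustrative; the helper names are hypothetical, and the opcode patterns 0111tttn (8-bit displacement jump) and 1001tttn (second byte of set on condition) are read off the encodings shown in Table 8.1.

```python
CONDITIONS = [
    "O", "NO", "B/NAE", "NB/AE", "E/Z", "NE/NZ", "BE/NA", "NBE/A",
    "S", "NS", "P/PE", "NP/PO", "L/NGE", "NL/GE", "LE/NG", "NLE/G",
]


def negate(tttn: int) -> int:
    """Toggle the n bit to select the opposite condition."""
    return tttn ^ 1


def short_jump_opcode(tttn: int) -> int:
    """Opcode byte of a conditional jump with 8-bit displacement (0111tttn)."""
    return 0b01110000 | tttn


def set_byte_opcode(tttn: int) -> int:
    """Second opcode byte of set-byte-on-condition (1001tttn), after 00001111."""
    return 0b10010000 | tttn


print(CONDITIONS[0b0101], bin(short_jump_opcode(0b0101)))
print(CONDITIONS[negate(0b0100)])
```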


ENCODING OF CONTROL OR DEBUG REGISTER (eee) FIELD


For the loading and storing of the Control and Debug registers.


When Interpreted as Control Register Field


eee Code   Reg Name
000        CR0
010        Reserved
011        Reserved


Do not use any other encoding


When Interpreted as Debug Register Field


eee Code   Reg Name
000        DR0
001        DR1
010        DR2
011        DR3
110        DR6
111        DR7


Do not use any other encoding
This 80376 data sheet, version -002, contains updates and improvements to previous versions. A revision summary is listed here for your convenience.


Front Page
Section 1.0
Section 2.0
Section 2.1
Section 2.1
Section 2.3
Section 2.6
Section 2.1;1
Section 2.10
Section 3.0
Section 3.2
Section 3.2
Section 3.3
Section 4.2
Section 4.4
Section 4.6
Section 4.7
Section 6.4
Section 6.4
Section 6.5
Section 8.1
Section 8.2


The 80376 Microarchitecture diagram was added.

Figure 1.2 was updated to show both top and bottom views of the 88-pin PGA package.

Figure 2.0 was updated to show the 16-bit registers SI, DI, BP and SP.

Figure 2.2 was updated to show the correct bit polarity for bit 4 in the CR0 register.

Tables 2.1 and 2.2 were updated to include additional information on the EFLAGS and CR0 registers.

Figure 2.3 was updated to more accurately reflect the addressing mechanism of the 80376.

In the subsection Maskable Interrupt a paragraph was added to describe the effect of interrupt gates on the IF EFLAGS bit.

Table 2.7 was updated to reflect the correct power up condition of the CR0 register.

Figure 2.6 was updated to show the correct bit positions of the BT, BS and BD bits in the DR6 register.

Figure 3.1 was updated to clearly show the address calculation process.

The subsection DESCRIPTORS was elaborated upon to clearly define the relationship between the linear address space and physical address space of the 80376.

Figures 3.3 and 3.4 were updated to show the AVL bit field.

The last sentence in the first paragraph of subsection PROTECTION AND I/O PERMISSION BIT MAP was deleted. This was an incorrect statement.

In the subsection ADDRESS BUS (BHE#, BLE#, A23-A1) the last sentence in the first paragraph was updated to reflect the numerics operand addresses as 8000FCH and 8000FEH. Because the 80376 sometimes does a double word I/O access, a second access to 8000FEH can be seen.

The subsection Hold Latencies was updated to describe how 32-bit and unaligned accesses are internally locked but do not assert the LOCK# signal.

Table 4.6 was updated to show the correct active data bits during a BLE# assertion.

This section was updated to correctly reflect the pipelining of the address and status of the 80376 as opposed to "Address Pipelining" which occurs on processors such as the 80286.

Table 4.7 was updated to show the correct Revision number, 05H.

Table 4.8 was updated to show the numerics operand register 8000FEH. This address is seen when the 80376 does a DWORD operation to the port address 8000FCH.

In the first paragraph the case temperatures were updated to correctly reflect the 0°C-115°C for the ceramic package and 0°C-110°C for the plastic package.

Table 6.2 was updated to correctly reflect the Case Temperature under Bias specification of -65°C-120°C.

Figure 6.8 vertical axis was updated to reflect "Output Valid Delay (ns)".

Figure 6.11 was updated to show typical ICC vs Frequency for the 80376.

This entire section was updated to reflect the new ICE-376 emulator.

The clock counts and opcodes for various instructions were updated to their correct value.

The section INSTRUCTION ENCODING was appended to the data sheet.
82370
INTEGRATED SYSTEM PERIPHERAL

• High Performance 32-Bit DMA Controller for 16-Bit Bus
  - 16 MBytes/Sec Maximum Data Transfer Rate at 16 MHz
  - 8 Independently Programmable Channels
• 20-Source Interrupt Controller
  - Individually Programmable Interrupt Vectors
  - 15 External, 5 Internal Interrupts
  - 82C59A Superset
• Four 16-Bit Programmable Interval Timers
  - 82C54 Compatible
• Software Compatible to 82380
• Programmable Wait State Generator
  - 0 to 15 Wait States Pipelined
  - 0 to 16 Wait States Non-Pipelined
• DRAM Refresh Controller
• 80376 Shutdown Detect and Reset Control
  - Software/Hardware Reset
• High Speed CHMOS III Technology
• 100-Pin Plastic Quad Flat-Pack Package and 132-Pin Pin Grid Array Package
  (See Packaging Handbook Order #231369)
• Optimized for Use with the 80376 Microprocessor
  - Resides on Local Bus for Maximum Bus Bandwidth


The 82370 is a multi-function support peripheral that integrates system
functions necessary in an 80376 environment. It has eight channels of
high performance 32-bit DMA (32-bit internal, 16-bit external) with the
most efficient transfer rates possible on the 80376 bus. System support
peripherals integrated into the 82370 provide Interrupt Control, Timers,
Wait State generation, DRAM Refresh Control, and System Reset logic.


The 82370's DMA Controller can transfer data between devices of
different data path widths using a single channel. Each DMA channel
operates independently in any of several modes. Each channel has a
temporary data storage register for handling non-aligned data without
the need for external alignment logic.


[Block diagram: 16-bit physical (32-bit logical) 8-channel DMA
controller; Timer 0, Timer 1, Timer 2, Timer 3]


Pin Descriptions 


The 82370 provides all of the signals necessary to interface to an
80376 host processor. It has a separate 24-bit address and 16-bit data
bus. It also has a set of control signals to support operation as a bus
master or a bus slave. Several special function signals exist on the
82370 for interfacing the system support peripherals to their
respective system counterparts.


Following are the definitions of the individual pins of the 82370.
These brief descriptions are provided as a reference. Each signal is
further defined within the sections which describe the associated 82370
function.


Symbol          Type  Name and Function

A1-A23          I/O   ADDRESS BUS: Outputs physical memory or port I/O
                      addresses. See Address Bus (2.2.3) for additional
                      information.

BHE#, BLE#      I/O   BYTE ENABLES: Indicate which data bytes of the data
                      bus take part in a bus cycle. See Byte Enable
                      (2.2.4) for additional information.

D0-D15          I/O   DATA BUS: This is the 16-bit data bus. These pins
                      are active outputs during interrupt acknowledges,
                      during Slave accesses, and when the 82370 is in the
                      Master Mode.

CLK2            I     PROCESSOR CLOCK: This pin must be connected to the
                      processor's clock, CLK2. The 82370 monitors the
                      phase of this clock in order to remain synchronized
                      with the CPU. This clock drives all of the internal
                      synchronous circuitry.

D/C#            I/O   DATA/CONTROL: D/C# is used to distinguish between
                      CPU control cycles and DMA or CPU data access
                      cycles. It is active as an output only in the
                      Master Mode.

W/R#            I/O   WRITE/READ: W/R# is used to distinguish between
                      write and read cycles. It is active as an output
                      only in the Master Mode.

M/IO#           I/O   MEMORY/IO: M/IO# is used to distinguish between
                      memory and I/O accesses. It is active as an output
                      only in the Master Mode.

ADS#            I/O   ADDRESS STATUS: This signal indicates presence of a
                      valid address on the address bus. It is active as
                      an output only in the Master Mode. ADS# is active
                      during the first T-state where addresses and
                      control signals are valid.

NA#             I     NEXT ADDRESS: Asserted by a peripheral or memory to
                      begin a pipelined address cycle. This pin is
                      monitored only while the 82370 is in the Master
                      Mode. In the Slave Mode, pipelining is determined
                      by the current and past status of the ADS# and
                      READY# signals.

HOLD            O     HOLD REQUEST: This is an active-high signal to the
                      Bus Master to request control of the system bus.
                      When control is granted, the Bus Master activates
                      the hold acknowledge signal (HLDA).

HLDA            I     HOLD ACKNOWLEDGE: This input signal tells the DMA
                      controller that the Bus Master has relinquished
                      control of the system bus to the DMA controller.


DREQ (0-3,      I     DMA REQUEST: The DMA Request inputs monitor
5-7)                  requests from peripherals requiring DMA service.
                      Each of the eight DMA channels has one DREQ input.
                      These active-high inputs are internally
                      synchronized and prioritized. Upon request, channel
                      0 has the highest priority and channel 7 the
                      lowest.

DREQ4/IRQ9#     I     DMA/INTERRUPT REQUEST: This is the DMA request
                      input for channel 4. It is also connected to the
                      interrupt controller via interrupt request 9. This
                      internal connection is available for DMA channel 4
                      only. The interrupt input is active low and can be
                      programmed as either edge or level triggered.
                      Either function can be masked by the appropriate
                      mask register. Priorities of the DMA channel and
                      the interrupt request are not related but follow
                      the rules of the individual controllers.

                      Note that this pin has a weak internal pull-up.
                      This causes the interrupt request to be inactive,
                      but the DMA request will be active if there is no
                      external connection made. Most applications will
                      require that either one or the other of these
                      functions be used, but not both. For this reason,
                      it is advised that DMA channel 4 be used for
                      transfers where a software request is more
                      appropriate (such as memory-to-memory transfers).
                      In such an application, DREQ4 can be masked by
                      software, freeing IRQ9# for other purposes.

EOP#            I/O   END OF PROCESS: As an output, this signal indicates
                      that the current Requester access is the last
                      access of the currently operating DMA channel. It
                      is activated when Terminal Count is reached. As an
                      input, it signals the DMA channel to terminate the
                      current buffer and proceed to the next buffer, if
                      one is available. This signal may be programmed as
                      an asynchronous or synchronous input.

                      EOP# must be connected to a pull-up resistor. This
                      will prevent erroneous external requests for
                      termination of a DMA process.

EDACK (0-2)     O     ENCODED DMA ACKNOWLEDGE: These signals contain the
                      encoded acknowledgment of a request for DMA service
                      by a peripheral. The binary code formed by the
                      three signals indicates which channel is active.
                      Channel 4 does not have a DMA acknowledge. The
                      inactive state is indicated by the code 100. During
                      a Requester access, EDACK presents the code for the
                      active DMA channel. During a Target access, EDACK
                      presents the inactive code 100.

IRQ(11-23)#     I     INTERRUPT REQUEST: These are active low interrupt
                      request inputs. The inputs can be programmed to be
                      edge or level sensitive. Interrupt priorities are
                      programmable as either fixed or rotating. These
                      inputs have weak internal pull-up resistors. Unused
                      interrupt request inputs should be tied inactive
                      externally.

INT             O     INTERRUPT OUT: INT signals that an interrupt
                      request is pending.

CLKIN           I     TIMER CLOCK INPUT: This is the clock input signal
                      to all of the 82370's programmable timers. It is
                      independent of the system clock input (CLK2).

TOUT1/REF#      O     TIMER 1 OUTPUT/REFRESH: This pin is software
                      programmable as either the direct output of Timer
                      1, or as the indicator of a refresh cycle in
                      progress. As REF#, this signal is active during the
                      memory read cycle which occurs during refresh.


TOUT2#/IRQ3#    I/O   TIMER 2 OUTPUT/INTERRUPT REQUEST: This is the
                      inverted output of Timer 2. It is also connected
                      directly to interrupt request 3. External hardware
                      can use IRQ3# if Timer 2 is programmed as OUT = 0
                      (TOUT2# = 1).

TOUT3#          O     TIMER 3 OUTPUT: This is the inverted output of
                      Timer 3.

READY#          I     READY INPUT: This active-low input indicates to the
                      82370 that the current bus cycle is complete.
                      READY# is sampled by the 82370 both while it is in
                      the Master Mode, and while it is in the Slave Mode.

WSC (0-1)       I     WAIT STATE CONTROL: WSC0 and WSC1 are inputs used
                      by the Wait-State Generator to determine the number
                      of wait states required by the currently accessed
                      memory or I/O. The binary code on these pins,
                      combined with the M/IO# signal, selects an internal
                      register in which a wait-state count is stored. The
                      combination WSC = 11 disables the wait-state
                      generator.

READYO#         O     READY OUTPUT: This is the synchronized output of
                      the wait-state generator. It is also valid during
                      CPU accesses to the 82370 in the Slave Mode when
                      the 82370 requires wait states. READYO# should
                      directly feed the processor's READY# input.

RESET           I     RESET: This synchronous input serves to initialize
                      the state of the 82370 and provides the basis for
                      the CPURST output. RESET must be held active for at
                      least 15 CLK2 cycles in order to guarantee the
                      state of the 82370. After Reset, the 82370 is in
                      the Slave Mode with all outputs except timers and
                      interrupts in their inactive states. The state of
                      the timers and interrupt controller must be
                      initialized through software. This input must be
                      active for the entire time required by the host
                      processor to guarantee proper reset.

CHPSEL#         O     CHIP SELECT: This pin is driven active whenever the
                      82370 is addressed in a slave bus read or write
                      cycle. It is also active during interrupt
                      acknowledge cycles when the 82370 is driving the
                      Data Bus. It can be used to control the local bus
                      transceivers to prevent contention with the system
                      bus.

CPURST          O     CPU RESET: CPURST provides a synchronized reset
                      signal for the CPU. It is activated in the event of
                      a software reset command, a processor shut-down
                      detect, or a hardware reset via the RESET pin. The
                      82370 holds CPURST active for 62 clocks in response
                      to either a software reset command or a shut-down
                      detection. Otherwise CPURST reflects the RESET
                      input.

Vcc                   POWER: +5V input power.

Vss                   Ground Reference.

Port      Wait-State Registers    Select Inputs
Address   D7-D4       D3-D0       WSC1   WSC0

72H       MEMORY 0    I/O 0       0      0
73H       MEMORY 1    I/O 1       0      1
74H       MEMORY 2    I/O 2       1      0
          DISABLED                1      1

M/IO#     1           0
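
The selection described by this table can be sketched in a few lines of Python. This is only an illustration: the port addresses and field positions are taken from the table, while the function names and the dictionary used to stand in for the register contents are hypothetical.

```python
# Port addresses and field layout follow the wait-state register table;
# the function names are illustrative and not part of any Intel software.
WAIT_STATE_PORTS = {(0, 0): 0x72, (0, 1): 0x73, (1, 0): 0x74}


def select_wait_state_field(wsc1, wsc0, m_io):
    """Return (port, shift) of the selected 4-bit field, or None if disabled."""
    if (wsc1, wsc0) == (1, 1):
        return None  # WSC = 11 disables the wait-state generator
    port = WAIT_STATE_PORTS[(wsc1, wsc0)]
    # M/IO# = 1 selects the memory field (D7-D4); 0 selects the I/O field (D3-D0)
    shift = 4 if m_io else 0
    return port, shift


def programmed_wait_states(registers, wsc1, wsc0, m_io):
    """Extract the programmed count from a dict mapping port -> 8-bit value."""
    selection = select_wait_state_field(wsc1, wsc0, m_io)
    if selection is None:
        return None
    port, shift = selection
    return (registers[port] >> shift) & 0xF
```

For example, with register 72H holding 31H, a memory access with WSC = 00 selects the upper field (3) and an I/O access selects the lower field (1).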


     A Row               B Row           C Row           D Row
Pin  Label          Pin  Label      Pin  Label      Pin  Label

1    CPURST         26   Vcc        51   A11        76   DREQ5
2    INT            27   D11        52   A10        77   DREQ4/IRQ9#
3    Vcc            28   D4         53   A9         78   DREQ3
4    Vss            29   D12        54   A8         79   DREQ2
5    TOUT2#/IRQ3#   30   D5         55   A7         80   DREQ1
6    TOUT3#         31   D13        56   A6         81   DREQ0
7    D/C#           32   D6         57   A5         82   IRQ23#
8    Vcc            33   Vss        58   Vcc        83   IRQ22#
9    W/R#           34   D14        59   A4         84   IRQ21#
10   M/IO#          35   D7         60   A3         85   IRQ20#
11   HOLD           36   D15        61   A2         86   IRQ19#
12   TOUT1/REF#     37   A23        62   A1         87   IRQ18#
13   CLK2           38   A22        63   Vss        88   IRQ17#
14   Vss            39   A21        64   BLE#       89   IRQ16#
15   READYO#        40   A20        65   BHE#       90   IRQ15#
16   EOP#           41   A19        66   Vss        91   IRQ14#
17   CHPSEL#        42   A18        67   ADS#       92   IRQ13#
18   Vcc            43   Vcc        68   Vcc        93   IRQ12#
19   D0             44   A17        69   EDACK2     94   IRQ11#
20   D8             45   A16        70   EDACK1     95   CLKIN
21   D1             46   A15        71   EDACK0     96   WSC0
22   D9             47   A14        72   HLDA       97   WSC1
23   D2             48   Vss        73   DREQ7      98   RESET
24   D10            49   A13        74   DREQ6      99   READY#
25   D3             50   A12        75   NA#        100  Vss


[Figure: 82370 PGA pinout, bottom view, 14 x 14 pin grid (columns A-P,
rows 1-14) with the metal lid at the center; the pin assignments are
listed in the table that follows]

82370 PGA Pinout


Pin  Label      Pin  Label      Pin  Label          Pin  Label

G14  CLK2       D14  D11        L1   DREQ0          A2   Vcc
N12  RESET      F12  D10        P6   IRQ23#         P2   Vcc
M12  CPURST     G13  D9         N6   IRQ22#         A4   Vcc
C5   A23        K14  D8         M7   IRQ21#         A12  Vcc
B4   A22        A9   D7         N7   IRQ20#         P12  Vcc
B3   A21        B10  D6         P7   IRQ19#         A14  Vcc
C4   A20        B11  D5         P8   IRQ18#         C14  Vcc
B2   A19        C13  D4         M8   IRQ17#         M14  Vcc
C3   A18        E12  D3         N8   IRQ16#         P14  Vcc
C2   A17        F13  D2         P9   IRQ15#         A5   NC
D3   A16        H14  D1         N9   IRQ14#         B5   NC
D2   A15        J14  D0         M9   IRQ13#         A6   NC
E3   A14        P11  W/R#       N10  IRQ12#         B6   NC
E2   A13        L13  M/IO#      P10  IRQ11#         C6   NC
E1   A12        K2   ADS#       M5   WSC0           A7   NC
F3   A11        M10  D/C#       M6   WSC1           B7   NC
F2   A10        N4   NA#        M13  TOUT3#         C7   NC
F1   A9         M11  READY#     N13  TOUT2#/IRQ3#   A8   NC
G1   A8         H12  READYO#    K13  TOUT1/REF#     B8   NC
G2   A7         J12  HOLD       N11  CLKIN          B9   NC
G3   A6         M3   HLDA       A1   Vss            C9   NC
H1   A5         M2   INT        C1   Vss            A11  NC
H2   A4         L12  EOP#       N1   Vss            B12  NC
J1   A3         L2   EDACK2     N2   Vss            C11  NC
H3   A2         M1   EDACK1     A3   Vss            D12  NC
J2   A1         L3   EDACK0     A13  Vss            G12  NC
J3   BLE#       N3   DREQ7      P13  Vss            B13  NC
K1   BHE#       M4   DREQ6      B14  Vss            D13  NC
K12  CHPSEL#    P3   DREQ5      L14  Vss            E13  NC
C8   D15        K3   DREQ4/IRQ9# N14 Vss            H13  NC
A10  D14        P4   DREQ3      B1   Vcc            J13  NC
C10  D13        N5   DREQ2      D1   Vcc            E14  NC
C12  D12        P5   DREQ1      P1   Vcc            F14  NC


The 82370 contains several independent functional modules. The
following is a brief discussion of the components and features of the
82370. Each module has a corresponding detailed section later in this
data sheet. Those sections should be referred to for design and
programming information.


The 82370 is comprised of several computer system functions that are
normally found in separate LSI and VLSI components. These include: a
high-performance, eight-channel, 32-bit Direct Memory Access
Controller; a 20-level Programmable Interrupt Controller which is a
superset of the 82C59A; four 16-bit Programmable Interval Timers which
are functionally equivalent to the 82C54 timers; a DRAM Refresh
Controller; a Programmable Wait State Generator; and system reset
logic. The interface to the 82370 is optimized for high-performance
operation with the 80376 microprocessor.


The 82370 operates directly on the 80376 bus. In the Slave Mode, it
monitors the state of the processor at all times and acts or idles
according to the commands of the host. It monitors the address pipeline
status and generates the programmed number of wait states for the
device being accessed. The 82370 also has logic to reset the 80376 in
response to hardware or software reset requests and processor shutdown
status.


After a system reset, the 82370 is in the Slave 
Mode. It appears to the system as an I/O device. It 
becomes a bus master when it is performing DMA 
transfers. 


To maintain compatibility with existing software, the registers within
the 82370 are accessed as bytes. If the internal logic of the 82370
requires a delay before another access by the processor, wait states
are automatically inserted into the access cycle. This allows the
programmer to write initialization routines, etc. without regard to
hardware recovery times.

Figure 1-1 shows the basic architectural compo- 
nents of the 82370. The following sections briefly 
discuss the architecture and function of each of the 
distinct sections of the 82370. 


[Figure 1-1: block diagram of the 82370 architecture. Signal labels
recoverable from the drawing: HOLD, HLDA, CLK2, READY#, READYO#, WSC0,
WSC1, DREQ7, EDACK0, EDACK1, EDACK2, EOP#, IRQ# (13 lines), INT,
TOUT2#, TOUT3#]


[Block diagram. Signals: HOLD, HLDA, DREQ0-DREQ7, EDACK0-EDACK2, EOP#.
Control/status registers for the "lower" group of channels: Command
Register I, Command Register II, Mode Register I, Mode Register II,
Software Request Register, Mask Register, Status Register, Bus Size
Register, Chaining Register. Channel registers for channel 0 (channels
1, 2 and 3 are the same as channel 0): Base and Current Byte Count,
Base and Current Requester Address, Base and Current Target Address,
Temporary Register. The upper group has the same control/status
registers as the lower group.]

Figure 1-2. 82370 DMA Controller


The 82370 contains a high-performance, 8-channel DMA Controller. It
provides a 32-bit internal data path. Through its 16-bit external
physical data bus, it is capable of transferring data in any
combination of bytes, words and double-words. The addresses of both
source and destination can be independently incremented, decremented
or held constant, and cover the entire 24-bit physical address space
of the 80376. It can disassemble and assemble non-aligned data via a
32-bit internal temporary data storage register. Data transferred
between devices of different data path widths can also be assembled
and disassembled using the internal temporary data storage register.
The DMA Controller can also transfer aligned data between I/O and
memory on the fly, allowing data transfer rates up to 16 megabytes per
second for an 82370 operating at 16 MHz. Figure 1-2 illustrates the
functional components of the DMA Controller.
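
The quoted peak rate can be checked with a short calculation. The sketch below assumes that an on-the-fly transfer moves one aligned 16-bit word per bus cycle and that a zero-wait-state bus cycle takes two processor clocks; both figures are assumptions made for illustration and are not stated in this section.

```python
def peak_dma_rate(clock_hz, bytes_per_cycle=2, clocks_per_bus_cycle=2):
    """Peak bytes per second, assuming back-to-back zero-wait bus cycles."""
    bus_cycles_per_second = clock_hz // clocks_per_bus_cycle
    return bus_cycles_per_second * bytes_per_cycle


# 16 MHz -> 8 million bus cycles per second -> 16 million bytes per second
rate = peak_dma_rate(16_000_000)
```

Any wait states inserted by the accessed device lengthen the bus cycle and reduce the rate accordingly.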


There are twenty-four general status and command registers in the
82370 DMA Controller. Through these registers any of the channels may
be programmed into any of the possible modes. The operating modes of
any one channel are independent of the operation of the other
channels.


Each channel has three programmable registers which determine the
location and amount of data to be transferred:

Byte Count Register -  Number of bytes to transfer. (24-bits)

Requester Register  -  Byte Address of memory or peripheral which is
                       requesting DMA service. (24-bits)

Target Register     -  Byte Address of peripheral or memory which will
                       be accessed. (24-bits)
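
As an illustration of these three 24-bit quantities, the following Python sketch models one channel's setup and checks that each value fits in 24 bits. The class name, field names, and the least-significant-byte-first split are hypothetical; the actual register addresses and programming sequence are given in the detailed DMA section of the data sheet.

```python
from dataclasses import dataclass

MAX_24BIT = (1 << 24) - 1


@dataclass
class DmaChannelSetup:
    """Illustrative container for one channel's three 24-bit registers."""
    byte_count: int
    requester_address: int
    target_address: int

    def __post_init__(self):
        for name in ("byte_count", "requester_address", "target_address"):
            value = getattr(self, name)
            if not 0 <= value <= MAX_24BIT:
                raise ValueError(f"{name} does not fit in 24 bits: {value:#x}")

    def as_bytes(self, name):
        """Split one register into three bytes (assumed order: low byte first)."""
        value = getattr(self, name)
        return [(value >> shift) & 0xFF for shift in (0, 8, 16)]
```

Because the 82370 registers are accessed as bytes, a 24-bit value has to be written in three byte-wide accesses; `as_bytes` shows one possible split.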


There are also port addresses which, when accessed, cause the 82370 to
perform specific functions. The actual data written does not matter;
the act of writing to the specific address causes the command to be
executed. The commands which operate in this mode are: Master Clear,
Clear Terminal Count Interrupt Request, Clear Mask Register, and Clear
Byte Pointer Flip-Flop.


DMA transfers can be done between all combinations of memory and I/O:
memory-to-memory, memory-to-I/O, I/O-to-memory, and I/O-to-I/O. DMA
service can be requested through software and/or hardware. Hardware
DMA acknowledge signals are available for all channels (except channel
4) through an encoded 3-bit DMA acknowledge bus (EDACK0-2).
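
A peripheral-side decoder for the encoded acknowledge bus might look like the sketch below. It assumes that EDACK2 is the most significant bit and that the binary code equals the channel number; the pin description states only that code 100 is the inactive state and that channel 4 has no acknowledge, so the bit ordering is an assumption.

```python
INACTIVE_CODE = 0b100  # also the code that would otherwise name channel 4


def decode_edack(edack2, edack1, edack0):
    """Return the acknowledged DMA channel number, or None when inactive."""
    code = (edack2 << 2) | (edack1 << 1) | edack0
    if code == INACTIVE_CODE:
        return None
    return code
```

Since the inactive code occupies the slot for channel 4, a device on channel 4 cannot use EDACK to qualify its transfers.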


The 82370 DMA Controller transfers blocks of data (buffers) in three
modes: Single Buffer, Buffer Auto-Initialize, and Buffer Chaining. In
the Single Buffer Process, the 82370 DMA Controller is programmed to
transfer one particular block of data. Successive transfers then
require reprogramming of the DMA channel. Single Buffer transfers are
useful in systems where it is known at the time the transfer begins
what quantity of data is to be transferred, and there is a contiguous
block of data area available.


The Buffer Auto-Initialize Process allows the same data area to be used
for successive DMA transfers without having to reprogram the channel.


The Buffer Chaining Process allows a program to specify a list of
buffer transfers to be executed. The 82370 DMA Controller, through
interrupt routines, is reprogrammed from the list. The channel is
reprogrammed for a new buffer before the current buffer transfer is
complete. This pipelining of the channel programming process allows
the system to allocate non-contiguous blocks of data storage space,
and transfer all of the data with one DMA process. The buffers that
make up the chain do not have to be in contiguous locations.
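
The pipelined reprogramming described for Buffer Chaining can be pictured with a toy software model. The sketch below is not the 82370 programming interface: the class, its two slots, and the `run_chain` driver are hypothetical and only show the ordering, with the next buffer loaded while the current one is still in progress.

```python
from collections import deque


class ChainedChannel:
    """Toy model: one buffer in flight, at most one queued behind it."""

    def __init__(self):
        self.current = None   # buffer being transferred
        self.queued = None    # buffer programmed ahead of time
        self.completed = []

    def program(self, buffer):
        """Load an (address, byte_count) pair into the first free slot."""
        if self.current is None:
            self.current = buffer
        elif self.queued is None:
            self.queued = buffer
        else:
            raise RuntimeError("next buffer is already programmed")

    def finish_current(self):
        """Model terminal count: retire the current buffer, start the queued one."""
        self.completed.append(self.current)
        self.current, self.queued = self.queued, None


def run_chain(buffers):
    """Drive the channel from a list, reprogramming before each buffer ends."""
    pending = deque(buffers)
    channel = ChainedChannel()
    channel.program(pending.popleft())
    while channel.current is not None:
        if pending:  # stands in for the interrupt routine reading the list
            channel.program(pending.popleft())
        channel.finish_current()
    return channel.completed
```

The buffers in the list can sit at unrelated addresses, which is the point of chaining: one DMA process covers several non-contiguous blocks.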


Channel priority can be fixed or rotating. Fixed priority allows the
programmer to define the priority of DMA channels based on hardware or
other fixed parameters. Rotating priority is used to provide
peripherals access to the bus on a shared basis.


With fixed priority, the programmer can set any channel to have the
current lowest priority. This allows the user to reset or manually
rotate the priority schedule without reprogramming the command
registers.


Four 16-bit programmable interval timers reside within the 82370. These
timers are identical in function to the timers in the 82C54
Programmable Interval Timer. All four of the timers share a common
clock input which can be independent of the system clock. The timers
are capable of operating in six different modes. In all of the modes,
the current count can be latched and read by the 80376 at any time,
making these very versatile event timers. Figure 1-3 shows the
functional components of the Programmable Interval Timers.


The outputs of the timers are directed to key system functions, making
system design simpler. Timer 0 is routed directly to an interrupt input
and is not available externally. This timer would typically be used to
generate time-keeping interrupts.


[Figure 1-3: functional components of the Programmable Interval Timers
(Timer 0, Timer 1, ...)]


Timers 1 and 2 have outputs which are available for general
timer/counter purposes as well as special functions. Timer 1 is routed
to the refresh control logic to provide refresh timing. Timer 2 is
connected to an interrupt request input to provide other timer
functions. Timer 3 is a general purpose timer/counter whose output is
available to external hardware. It is also connected internally to the
interrupt request which defaults to the highest priority (IRQ0).


The 82370 has the equivalent of three enhanced 
82C59A Programmable Interrupt Controllers. These 
controllers can all be operated in the Master Mode, 
but the priority is always as if they were cascaded. 
There are 15 interrupt request inputs provided for 
the user, all of which can be inputs from external 
slave interrupt controllers. Cascading 82C59As to 
these request inputs allows a possible total of 120 
external interrupt requests. Figure 1-4 is a block dia- 
gram of the 82370 Interrupt Controller. 


Each of the interrupt request inputs can be individually programmed
with its own interrupt vector, allowing more flexibility in interrupt
vector mapping than was available with the 82C59A. An interrupt is
provided to alert the system that an attempt is being made to program
the vectors in the method of the 82C59A. This provides compatibility
of existing software that used the 82C59A or 8259A with new designs
using the 82370.


In the event of an unrequested or otherwise errone- 
ous interrupt acknowledge cycle, the 82370 Interrupt 
Controller issues a default vector. This vector, pro- 
grammed by the system software, will alert the sys- 
tem of unsolicited interrupts of the 80376. 


The functions of the 82370 Interrupt Controller are 
identical to the 82C59A, except in regards to pro- 
gramming the interrupt vectors as mentioned above. 
Interrupt request inputs are programmable as either 
edge or level triggered and are software maskable. 
Priority can be either fixed or rotating and interrupt 
requests can be nested. 


Enhancements are added to the 82370 for cascad- 
ing external interrupt controllers. Master to Slave 
handshaking takes place on the data bus, instead of 
dedicated cascade lines. 




1.1.4 WAIT STATE GENERATOR


The Wait State Generator is a programmable READY generation circuit
for the 80376 bus. A peripheral requiring wait states can request the
Wait State Generator to hold the processor's READY input inactive for
a predetermined number of bus states. Six different wait state counts
can be programmed into the Wait State Generator by software; three for
memory accesses and three for I/O accesses. A block diagram of the
82370 Wait State Generator is shown in Figure 1-5.


The peripheral being accessed selects the required wait state count by
placing a code on a 2-bit wait state select bus. This code along with
the M/IO# signal from the bus master is used to select one of six
internal 4-bit wait state registers which has been programmed with the
desired number of wait states. From zero to fifteen wait states can be
programmed into the wait state registers. The Wait State Generator
tracks the state of the processor or current bus master at all times,
regardless of which device is the current bus master and regardless of
whether or not the wait state generator is currently active.


The 82370 Wait State Generator is disabled by mak- 
ing the select inputs both high. This allows hardware 
which is intelligent enough to generate its own ready 
signal to be accessed without penalty. As previously 
mentioned, deselecting the Wait State Generator 
does not disable its ability to determine the proper 
number of wait states due to pipeline status in sub- 
sequent bus cycles. 


The number of wait states inserted into a pipelined 
bus cycle is the value in the selected wait state reg- 
ister. If the bus master is operating in the non-pipe- 
lined mode, the Wait State Generator will increase 
the number of wait states inserted into the bus cycle 
by one. 
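
The rule in the preceding paragraph reduces to a one-line calculation, sketched below. The function name is hypothetical; the 4-bit range check follows from the register width given earlier.

```python
def inserted_wait_states(register_value, pipelined):
    """Wait states inserted for a cycle, given the selected 4-bit register."""
    if not 0 <= register_value <= 15:
        raise ValueError("wait-state registers are 4 bits wide")
    # A non-pipelined bus cycle receives one wait state more than programmed.
    return register_value if pipelined else register_value + 1
```

With the maximum programmed value of 15, this gives 15 wait states for a pipelined cycle and 16 for a non-pipelined one, matching the ranges listed among the device features.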


On reset the Wait State Generator's registers are loaded with the
value FFH, giving the maximum number of wait states for any access in
which the wait state select inputs are active.


1.1.5 DRAM REFRESH CONTROLLER


The 82370 DRAM Refresh Controller consists of a 24-bit refresh address
counter and bus arbitration logic. The output of Timer 1 is used to
periodically request a refresh cycle. When the controller receives the
request, it requests access to the system bus through the HOLD signal.
When bus control is acknowledged by the processor or current bus
master, the refresh controller executes a memory read operation at the
address currently in the Refresh Address Register. At the same time,
it activates a refresh signal (REF#) that the memory uses to force a
refresh instead of a normal read. Control of the bus is transferred to
the processor at the completion of this cycle. Typically a refresh
cycle will take six clock cycles to execute on an 80376 bus.


The 82370 DRAM Refresh Controller has the highest priority when
requesting bus access and will interrupt any active DMA process. This
allows large blocks of data to be moved by the DMA controller without
affecting the refresh function. Also, the DMA controller is not
required to completely relinquish the bus; the refresh controller
simply steals a bus cycle between DMA accesses.


The amount by which the refresh address is incre- 
mented is programmable to allow for different bus 
widths and memory bank arrangements. 
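
The effect of the programmable increment can be sketched as follows. The wrap-around at 24 bits is an assumption based on the counter width; the function name is hypothetical, and the set of increments the device actually accepts is defined in the detailed refresh controller section.

```python
ADDRESS_MASK = (1 << 24) - 1  # the refresh address counter is 24 bits wide


def next_refresh_address(current, increment):
    """Advance the refresh address, wrapping within 24 bits (assumed)."""
    return (current + increment) & ADDRESS_MASK
```

For example, an increment of 2 steps through word addresses on a 16-bit bus, while a larger increment could be used when each refresh cycle covers several interleaved banks.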


1.1.6 CPU RESET FUNCTION


The 82370 contains a special reset function which can respond to
hardware reset signals as well as a software reset command. The
circuit will hold the 80376's RESET line active while an external
hardware reset signal is present at its RESET input. It can also reset
the 80376 processor as the result of a software command. The software
reset command causes the 82370 to hold the processor's RESET line
active for a minimum of 62 clock cycles. The 80376 requires that its
RESET line be held active for a minimum of 80 clock cycles to
re-initialize. For a more detailed explanation and solution, see
Appendix D (System Notes).


The 82370 can be programmed to sense the shut- 
down detect code on the status lines from the 
80376. If the Shutdown Detect function is enabled, 
the 82370 will automatically reset the processor. A 
diagnostic register is available which can be used to 
determine the cause of reset. 


After a hardware reset, the internal registers of the 82370 are located
in I/O space beginning at port address 0000H. The map of the 82370's
registers is relocatable via a software command. The default mapping
places the 82370 between I/O addresses 0000H and 00DBH. The relocation
register allows this map to be moved to any even 256-byte boundary in
the processor's 16-bit I/O address space or any even 64 kbyte boundary
in the 24-bit memory address space.
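
The alignment rule for a relocated register map can be expressed as a small check. The sketch below reads "even boundary" as an exact multiple of the stated size, which is an interpretation; the function name and the "io"/"memory" labels are hypothetical.

```python
def is_valid_register_base(address, space):
    """Check a candidate base address for the relocated 82370 register map."""
    if space == "io":
        limit, alignment = 1 << 16, 256        # 16-bit I/O space, 256-byte steps
    elif space == "memory":
        limit, alignment = 1 << 24, 64 * 1024  # 24-bit memory space, 64 kbyte steps
    else:
        raise ValueError("space must be 'io' or 'memory'")
    return 0 <= address < limit and address % alignment == 0
```

Under this reading, 1200H is an acceptable I/O base and 1280H is not, and the default base 0000H passes for either space.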


The 82370 is designed to operate efficiently on the 
local bus of an 80376 microprocessor. The control 
signals of the 82370 are identical in function to 
those of the 80376. As a slave, the 82370 operates 
with all of the features available on the 80376 bus. 
When the 82370 is in the Master Mode, it looks iden- 
tical to an 80376 to the connected devices. 


The 82370 monitors the bus at all times, and deter- 
mines whether the current bus cycle is a pipelined or 
non-pipelined access. All of the status signals of the 
processor are monitored. 


The control, status, and data registers within the 
82370 are located at fixed addresses relative to 
each other, but the group can be relocated to either 
memory or I/O space and to different locations with- 
in those spaces. 


As a Slave device, the 82370 monitors the control/status lines of the
CPU. The 82370 will generate all of the wait states it needs whenever
it is accessed. This allows the programmer the freedom of accessing
82370 registers without having to insert NOPs in the program to wait
for slower 82370 internal registers.


The 82370 can determine if a current bus cycle is a 
pipelined or a non-pipelined cycle. It does this by 
monitoring the ADS#, NA# and READY# signals 
and thereby keeping track of the current state of the 
80376. 


As a bus master, the 82370 looks like an 80376 to 
the rest of the system. This enables the designer 
greater flexibility 
in systems which 
include the 
82370. The designer does not have to alter the inter- 
faces of any peripherals designed to operate with 
the 80376 to accommodate the 82370. The 82370 
will access any peripherals on the bus in the same 
manner as the 80376, including recognizing pipe- 
lined bus cycles. 


The 82370 is accessed as an 8-bit peripheral. The 
80376 places the data of all 8-bit accesses either on 
D(0-7) or D(8-15). The 82370 will only accept data 
on these lines when in the Slave Mode. When in the 
Master Mode, the 82370 is a full 16-bit machine, 
sending and receiving data in the same manner as 
the 80376. 


The 82370 contains a set of interface signals to op- 
erate efficiently with the 80376 host processor. 
These signals were designed so that minimal hard- 
ware is needed to connect the 82370 to the 80376. 
Figure 2-1 depicts a typical system configuration 
with the 80376 processor. As shown in the diagram, 
the 82370 is designed to interface directly with the 
80376 bus. 


Since the 82370 resides on the opposite side of the 
data bus transceivers with respect to the rest of the 
system peripherals, it is important to note that the 
transceivers should be controlled so that contention 
between the data bus transceivers and the 82370 
will not occur. In order to ease the implementation of 
this, the 82370 activates the CHPSEL# signal which 
indicates that the 82370 has been addressed and 
may output data. This signal should be included in 
the direction and enable control logic of the
transceiver. When any of the 82370 internal registers are
read, the data bus transceivers should be disabled 
so that only the 82370 will drive the local bus. 


This section describes the basic bus functions of the 
82370 to show how this device interacts with the 
80376 processor. Other signals which are not direct- 
ly related to the host interface will be discussed in 
their associated functional block description. 
At any time, the 82370 acts as either a Slave device
or a Master device in the system. Upon reset, the
82370 will be in the Slave Mode. In this mode, the
80376 processor can read/write into the 82370
internal registers. Initialization information may be
programmed into the 82370 during Slave Mode.


When DMA service (including DRAM Refresh Cycles
generated by the 82370) is requested, the 82370 will
request and subsequently get control of the 80376
local bus. This is done through the HOLD and HLDA
(Hold Acknowledge) signals. When the 80376 processor
responds by asserting the HLDA signal, the
82370 will switch into Master Mode and perform
DMA transfers. In this mode, the 82370 is the bus
master of the system. It can read/write data from/to
memory and peripheral devices. The 82370 will
return to the Slave Mode upon completion of DMA
transfers, or when HLDA is negated.


As mentioned in the Architecture section, the Bus
Interface module of the 82370 (see Figure 1-1)
contains signals that are directly connected to the
80376 host processor. This module has separate
16-bit Data and 24-bit Address busses. Also, it has
additional control signals to support different bus
operations on the system. By residing on the 80376
local bus, the 82370 shares the same address, data
and control lines with the processor. The following
subsections discuss the signals which interface to
the 80376 host processor.


2.2.1 CLOCK (CLK2) 


The CLK2 input provides fundamental timing for the 
82370. It is divided by two internally to generate the 
82370 internal clock. Therefore, CLK2 should be 
driven with twice the 80376's frequency. In order to
maintain synchronization with the 80376 host proc- 
essor, the 82370 and the 80376 should share a 
common clock source. 


The internal clock consists of two phases: PHI1 and 
PHI2. Each CLK2 period is a phase of the internal 
clock. PHI2 is usually used to sample input and set 
up internal signals and PHI1 is for latching internal 
data. Figure 2-2 illustrates the relationship of CLK2 
and the 82370 internal clock signals. The CPURST 
signal generated by the 82370 guarantees that the 
80376 will wake up in phase with PHI1. 


2.2.2 DATA BUS (D0-D15)


This 16-bit three-state bidirectional bus provides a 
general purpose data path between the 82370 and 
the system. These pins are tied directly to the corre- 
sponding Data Bus pins of the 80376 local bus. The 
Data Bus is also used for interrupt vectors generated 
by the 82370 in the Interrupt Acknowledge cycle. 


During Slave I/O operations, the 82370 expects a
single byte to be written or read. When the 80376
host processor writes into the 82370, either D0-D7
or D8-D15 will be latched into the 82370, depending
upon whether Byte Enable bit BLE# is 0 or 1 (see
Table 2-1). When the 80376 host processor reads
from the 82370, the single byte data will be
duplicated twice on the Data Bus; i.e. on D0-D7 and
D8-D15.
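The Slave Mode byte-lane behavior described above can be sketched in C. This is only a software model for illustration, not device code: the function names are hypothetical, and `ble_n` is assumed to carry the logic level of the active-LOW BLE# pin.

```c
#include <stdbool.h>
#include <stdint.h>

/* Slave-mode write: the 82370 latches one byte from the 16-bit bus.
 * ble_n is the level of BLE# (hypothetical parameter name):
 * 0 selects D0-D7, 1 selects D8-D15. */
static uint8_t slave_latch_byte(uint16_t data_bus, bool ble_n)
{
    return ble_n ? (uint8_t)(data_bus >> 8) : (uint8_t)(data_bus & 0xFFu);
}

/* Slave-mode read: the single register byte is duplicated on
 * both halves of the data bus (D0-D7 and D8-D15). */
static uint16_t slave_drive_byte(uint8_t reg)
{
    return (uint16_t)(((uint16_t)reg << 8) | reg);
}
```

Because a read drives the same byte on both halves, the host can pick it up from either byte lane.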


During Master Mode, the 82370 can transfer 16- and
8-bit data between memory (or I/O devices) and
I/O devices (or memory) via the Data Bus.


These three-state bidirectional signals are connected
directly to the 80376 Address Bus. In the Slave
Mode, they are used as input signals so that the
processor can address the 82370 internal
ports/registers. In the Master Mode, they are used as
output signals by the 82370 to address memory and
peripheral devices. The Address Bus is capable of
addressing 16 Mbytes of physical memory space
(000000H to FFFFFFH), and 64 Kbytes of I/O
addresses.


2.2.4 BYTE ENABLE (BHE#, BLE#) 


The Byte Enable pins BHE# and BLE# select the
specific byte(s) in the word addressed by A1-A23.
During Master Mode operation, they are used as
outputs by the 82370 to address memory and I/O
locations. The definition of BHE# and BLE# is further
illustrated in Table 2-1.


NOTE: 
The 82370 will activate BHE# when output in Mas- 
ter Mode. For a more detailed explanation and its 
solutions, see Appendix D (System Notes). 
BHE#  BLE#  Byte to be Accessed   Logical Byte Presented on
            Relative to A23-A1    Data Bus During WRITE Only*
                                  D15-D8    D7-D0
----  ----  -------------------   ------    -----
 0     0    0, 1                  B         A
 0     1    1                     A         A
 1     0    0                     U         A
 1     1    (Not Used)


U = Undefined
A = Logical D0-D7
B = Logical D8-D15


*NOTE:
Actual number of bytes accessed depends upon the programmed data path width.


M/IO#  D/C#  W/R#  As INPUTS               As OUTPUTS
-----  ----  ----  ---------------------   -------------
  0     0     0    Interrupt Acknowledge   NOT GENERATED
  0     0     1    UNDEFINED               NOT GENERATED
  0     1     0    I/O Read                I/O Read
  0     1     1    I/O Write               I/O Write
  1     0     0    UNDEFINED               NOT GENERATED
  1     0     1    HALT if A1 = 1          NOT GENERATED
                   SHUTDOWN if A1 = 0
  1     1     0    Memory Read             Memory Read
  1     1     1    Memory Write            Memory Write


2.2.5 BUS CYCLE DEFINITION SIGNALS (D/C#, W/R#, M/IO#)


These three-state bidirectional signals define the
type of bus cycle being performed. W/R#
distinguishes between write and read cycles. D/C#
distinguishes between processor data and control
cycles. M/IO# distinguishes between memory and I/O
cycles.


During Slave Mode, these signals are driven by the 
80376 host processor; during Master Mode, they are 
driven by the 82370. In either mode, these signals 
will be valid when the Address Status (ADS#) is 
driven LOW. Exact bus cycle definitions are given in 
Table 2-2. Note that some combinations are
recognized as inputs, but not generated as outputs. In the
Master Mode, D/C# is always HIGH.
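The bus cycle encoding in Table 2-2 can be expressed as a small decoder. The C sketch below is illustrative only: the enum and function names are hypothetical, and each `bool` argument is assumed to hold the logic level of the corresponding pin (true = HIGH).

```c
#include <stdbool.h>

typedef enum {
    CYCLE_INT_ACK, CYCLE_UNDEFINED, CYCLE_IO_READ, CYCLE_IO_WRITE,
    CYCLE_HALT, CYCLE_SHUTDOWN, CYCLE_MEM_READ, CYCLE_MEM_WRITE
} bus_cycle_t;

/* Decode a cycle seen as an input (Slave Mode), per Table 2-2. */
static bus_cycle_t decode_input_cycle(bool m_io, bool d_c, bool w_r, bool a1)
{
    unsigned code = (m_io ? 4u : 0u) | (d_c ? 2u : 0u) | (w_r ? 1u : 0u);

    switch (code) {
    case 0: return CYCLE_INT_ACK;
    case 2: return CYCLE_IO_READ;
    case 3: return CYCLE_IO_WRITE;
    case 5: return a1 ? CYCLE_HALT : CYCLE_SHUTDOWN;
    case 6: return CYCLE_MEM_READ;
    case 7: return CYCLE_MEM_WRITE;
    default: return CYCLE_UNDEFINED;   /* codes 1 and 4 */
    }
}

/* In Master Mode D/C# is always HIGH, so only the four
 * data cycles (I/O or memory, read or write) are generated. */
static bool master_can_generate(bool m_io, bool d_c, bool w_r)
{
    (void)m_io;
    (void)w_r;
    return d_c;
}
```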


2.2.6 ADDRESS STATUS (ADS#) 


This signal indicates that a valid address (A1-A23,
BHE#, BLE#) and bus cycle definition (W/R#, D/C#,
M/IO#) is being driven on the bus. In the
Master Mode, it is driven by the 82370 as an output.
In the Slave Mode, this signal is monitored as
an input by the 82370. From the current and past
status of ADS# and the READY# input, the 82370
is able to determine, during Slave Mode, if the next
bus cycle is a pipelined address cycle. ADS# is
asserted during T1 and T2P bus states (see Bus State
Definition).


NOTE:
ADS# must be qualified with the rising edge of
CLK2.


This input indicates that the current bus cycle is 
complete. In the Master Mode, assertion of this sig- 
nal indicates the end of a DMA bus cycle. In the 
Slave Mode, the 82370 monitors this input and 
ADS# to detect a pipelined address cycle. This sig- 
nal should be tied directly to the READY# input of 
the 80376 host processor. 


2.2.8 NEXT ADDRESS REQUEST (NA#) 


This input is used to indicate to the 82370 in the
Master Mode that the system is requesting address
pipelining. When driven LOW by either memory or
peripheral devices during Master Mode, it indicates
that the system is prepared to accept a new address
and bus cycle definition signals from the 82370
before the end of the current bus cycle. If this input is
active when sampled by the 82370, the next address
is driven onto the bus, provided a bus request is
already pending internally.


This input pin is monitored only in the Master Mode.
In the Slave Mode, the 82370 uses the ADS# and
READY# signals to determine address pipelining
cycles, and NA# will be ignored.


2.2.9 RESET (RESET, CPURST)


RESET


This synchronous input suspends any operation in
progress and places the 82370 in a known initial
state. Upon reset, the 82370 will be in the Slave
Mode waiting to be initialized by the 80376 host
processor. The 82370 is reset by asserting RESET
for 15 or more CLK2 periods. When RESET is
asserted, all other input pins are ignored, and all other
bus pins are driven to an idle bus state as shown in
Table 2-3. The 82370 will determine the phase of its
internal clock following RESET going inactive.


RESET is level-sensitive and must be synchronous
to the CLK2 signal. The RESET setup and hold time
requirements are shown in Figure 2-3.


Signal                              Level
----------------------------------  ------------------
A1-A23, D0-D15, BHE#, BLE#          Float
D/C#, W/R#, M/IO#, ADS#             Float
READYO#                             '1'
EOP#                                '1' (Weak Pull-Up)
EDACK2-EDACK0                       '100'
HOLD                                '0'
INT                                 UNDEFINED*
TOUT1/REF#, TOUT2#/IRQ3#, TOUT3#    UNDEFINED*
CPURST                              '0'
CHPSEL#                             '1'


*NOTE:
The Interrupt Controller and Programmable Interval Timer
are initialized by software commands.


This output signal is used to reset the 80376 host
processor. It will go active (HIGH) whenever one of
the following events occurs: a) 82370's RESET input
is active; b) a software RESET command is issued
to the 82370; or c) when the 82370 detects a
processor Shutdown cycle and when this detection
feature is enabled (see CPU Reset and Shutdown
Detect). When activated, CPURST will be held active
for 62 clocks. The timing of CPURST is such that the
80376 processor will be in synchronization with the
82370. This timing is shown in Figure 2-4.
Figure 2-3. RESET Timing


2.2.10 INTERRUPT OUT (INT)


This output pin is used to signal the 80376 host 
processor that one or more interrupt requests (either 
internal or external) are pending. The processor is 
expected to respond with an Interrupt Acknowledge 
cycle. This signal should be connected directly to 
the Maskable Interrupt Request (INTR) input of the 
80376 host processor. 


2.3 82370 Bus Timing 


The 82370 internally divides the CLK2 signal by two 
to generate its internal clock. Figure 2-2 showed the 
relationship of CLK2 and the internal clock which 
consists of two phases: PHI1 and PHI2. Each CLK2 
period is a phase of the internal clock. 


In the 82370, whether it is in the Master or Slave
Mode, the shortest time unit of bus activity is a bus
state. A bus state, which is also referred to as a
'T-state', is defined as one 82370 PHI2 clock period
(i.e. two CLK2 periods). Recall that in Table 2-2 various
types of bus cycles in the 82370 are defined by the
M/IO#, D/C# and W/R# signals. Each of these
bus cycles is composed of two or more bus states.
The length of a bus cycle depends on when the
READY# input is asserted (i.e. driven LOW).
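As a worked illustration of these definitions, the C sketch below computes the length of a T-state and of a bus cycle from the CLK2 frequency. The function names are hypothetical, and it assumes that each wait state adds exactly one extra bus state to the two-state minimum. Results are in picoseconds so that the arithmetic stays in integers.

```c
#include <stdint.h>

/* One T-state is two CLK2 periods. clk2_hz must be non-zero. */
static uint64_t t_state_ps(uint64_t clk2_hz)
{
    return 2u * 1000000000000ull / clk2_hz;
}

/* A bus cycle is at least two bus states; each wait state is
 * assumed here to add one more bus state. */
static uint64_t bus_cycle_ps(uint64_t clk2_hz, unsigned wait_states)
{
    return (2u + wait_states) * t_state_ps(clk2_hz);
}
```

For example, with an assumed 32 MHz CLK2, `t_state_ps(32000000)` gives 62500 ps (62.5 ns), so a zero-wait-state cycle lasts 125 ns.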


The 82370 supports Address Pipelining as an option 
in both the Master and Slave Mode. This feature typ- 
ically allows a memory or peripheral device to oper- 
ate with one less wait state than would otherwise be 
required. This is possible because during a pipelined 
cycle, the address and bus cycle definition of the
next cycle will be generated by the bus master while
waiting for the end of the current cycle to be ac- 
knowledged. The pipelined bus is especially well 
suited for an interleaved memory environment. For 
16 MHz interleaved memory designs with 100 ns ac- 
cess time DRAMs, zero wait state memory accesses 
can be achieved when pipelined addressing is se- 
lected. 


In the Master Mode, the 82370 is capable of initiat- 
ing, on a cycle-by-cycle basis, either a pipelined or 
non-pipelined access depending upon the state of 
the NA# input. If a pipelined cycle is requested (indi- 
cated by NA# being driven LOW), the 82370 will 
drive the address and bus cycle definition of the next
cycle as soon as there is an internal bus request 
pending. 


In the Slave Mode, the 82370 is constantly monitoring
the ADS# and READY# signals on the processor
local bus to determine if the current bus cycle is
a pipelined cycle. If a pipelined cycle is detected, the
82370 will request one less wait state from the
processor if the Wait State Generator feature is selected.
On the other hand, during an 82370 internal register
access in a pipelined cycle, it will make use of the
advance address and bus cycle information. In all
cases, Address Pipelining will result in a savings of
one wait state.


2.3.2 MASTER MODE BUS TIMING


When the 82370 is in the Master Mode, it will be in 
one of six bus states. Figure 2-5 shows the complete 
bus state diagram of the Master Mode, including 
pipelined address states. As seen in the figure, the 
82370 state diagram is very similar to that of the 
80376. The major difference is that in the 82370, 
there is no Hold state. Also, in the 82370, the condi- 
tions for some state transitions depend upon wheth- 
er it is the end of a DMA process. 


NOTE: 
The term 'end of a DMA process' is loosely defined 
here. It depends on the DMA modes of operation 
as well as the state of the EOP# and DREQ inputs.
This is explained in detail in section 3 (DMA
Controller).


The 82370 will enter the idle state, Ti, upon RESET 
and whenever the internal address is not available at 
the end of a DMA cycle or at the end of a DMA 
process. When address pipelining is not used (NA# 
is not asserted), a new bus cycle always begins with 
state T1. During T1, address and bus cycle definition 
signals will be driven on the bus. T1 is always fol- 
lowed by T2. 


If a bus cycle is not acknowledged (with READY#) 
during T2 and NA# is negated, T2 will be repeated. 
When the end of the bus cycle is acknowledged during
T2, the following state will be T1 of the next bus
cycle (if the internal address latch is loaded and if 
this is not the end of the DMA process). Otherwise, 
the Ti state will be entered. Therefore, if the memory 
or peripheral accessed is fast enough to respond 
within the first T2, the fastest non-pipelined cycle will 
take one T1 and one T2 state. 
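The non-pipelined transitions described above (NA# never asserted) can be modeled as a three-state machine. The C sketch below is an illustration, not a reproduction of Figure 2-5: the type and field names are hypothetical, and the Ti to T1 condition (address available and not at the end of the DMA process) is an assumption inferred from the text.

```c
#include <stdbool.h>

typedef enum { ST_TI, ST_T1, ST_T2 } bus_state_t;

typedef struct {
    bool ready;       /* READY# sampled asserted (LOW) in this state */
    bool addr_avail;  /* internal address latch is loaded */
    bool end_of_dma;  /* end of the DMA process has been reached */
} bus_inputs_t;

/* Next bus state when address pipelining is not used. */
static bus_state_t next_state_non_pipelined(bus_state_t s, bus_inputs_t in)
{
    bool can_start = in.addr_avail && !in.end_of_dma;

    switch (s) {
    case ST_TI:
        return can_start ? ST_T1 : ST_TI;  /* assumed exit condition */
    case ST_T1:
        return ST_T2;                      /* T1 is always followed by T2 */
    case ST_T2:
        if (!in.ready)
            return ST_T2;                  /* repeat T2 as a wait state */
        return can_start ? ST_T1 : ST_TI;
    }
    return ST_TI;
}
```

With READY# asserted in the first T2, the model alternates T1, T2, T1, T2, which matches the fastest non-pipelined cycle of one T1 and one T2 state.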


Use of the address pipelining feature allows the
82370 to enter three additional bus states: T1P, T2P
and T2i. T1P is the first bus state of a pipelined bus
cycle. T2P follows T1P (or T2) if NA# is asserted
when sampled. The 82370 will drive the bus with the
address and bus cycle definition signals of the next
cycle during T2P. From the state diagram, it can be
seen that after an idle state Ti, the first bus cycle
must begin with T1, and is therefore a non-pipelined
bus cycle. The next bus cycle can be pipelined if
NA# is asserted and the previous bus cycle ended
in a T2P state. Once the 82370 is in a pipelined 
cycle and provided that NA# is asserted in subse- 
quent cycles, the 82370 will be switching between 
T1P and T2P states. If the end of the current bus 
cycle is not acknowledged by the READY# input, 
the 82370 will extend the cycle by adding T2P 
states. The fastest pipelined cycle will consist of one 
T1P and one T2P state. 


The 82370 will enter state T2i when NA# is assert- 
ed and when one of the following two conditions 
occurs. The first condition is when the 82370 is in 
state T2. T2i will be entered if READY# is not as- 
serted and there is no next address available. This 
situation is similar to a wait state. The 82370 will stay 
in T2i for as long as this condition exists. The sec- 
ond condition which will cause the 82370 to enter 
T2i is when the 82370 is in state T1P. Before going 
to state T2P, the 82370 needs to wait in state T2i 
until the next address is available. Also, in both cas- 
es, if the DMA process is complete, the 82370 will 
enter the T2i state in order to finish the current DMA 
cycle. 


Figure 2-6 is a timing diagram showing non-pipelined 
bus accesses in the Master Mode. Figure 2-7 shows 
the timing of pipelined accesses in the Master Mode. 


Figure 2-8 shows the Slave Mode bus timing in both 
pipelined and non-pipelined cycles when the 82370 
is being accessed. Recall that during Slave Mode, 
the 82370 will constantly monitor the ADS# and 
READY# signals to determine if the next cycle is 
pipelined. In Figure 2-8, the first cycle is non-pipe- 
lined and the second cycle is pipelined. In the pipe- 
lined cycle, the 82370 will start decoding the ad- 
dress and bus cycle signals one bus state earlier 
than in a non-pipelined cycle. 


The READY# input signal is sampled by the 80376 
host processor to determine the completion of a bus 
cycle. This occurs during the end of every T2, T2i 
and T2P state. Normally, the output of the 82370 
Wait State Generator, READYO#, is directly con- 
nected to the READY# input of the 80376 host 
processor and the 82370. In such case, READYO# 
and READY# will be identical (see Wait State Gen- 
erator). 


NOTE:
ADAV - Internal Address Available


... it will take one or more wait states in a pipelined
cycle, and two or more wait states in a non-pipelined
cycle, to complete the internal access.


Figure 2-8. Slave Read/Write Timing


The 82370 has eight channels of DMA. Each chan- 
nel operates independently of the others. Within the 
operation of the individual channels, there are many 
different modes of data transfer available. Many of 
the operating modes can be intermixed to provide a 
very versatile DMA controller. 


The 82370 DMA Controller is capable of transferring 
data between any combination of memory and/or 
I/O, with any combination of data path widths. The 
82370 DMA Controller can be programmed to
accommodate 8- or 16-bit devices. With its 16-bit
external data path, it can transfer data in units of a byte
or a word. Bus bandwidth is optimized through the
use of an internal temporary register which can dis- 
assemble or assemble data to or from either an 
aligned or non-aligned destination or source. Figure 
3-1 is a block diagram of the 82370 DMA Controller. 


3.1 Functional Description


[Figure 3-1: block diagram of the 82370 DMA Controller.
It shows the HOLD and HLDA signals, the request inputs
DREQ0-DREQ7, and two groups of channels: a "lower"
group (Channels 0-3) and an "upper" group (Channels 4-7).
Each group has its own Control/Status registers (Command
Registers I and II, Mode Registers I and II, Software
Request, Mask, Status, Bus Size, and Chaining Registers).
Each channel has Base and Current Byte Count, Requester
Address, and Target Address registers, plus a Temporary
Register.]


In describing the operation of the 82370's DMA
Controller, close attention to terminology is required.
Before entering the discussion of the function of the
82370 DMA Controller, the following explanations of
some of the terminology used herein may be of
benefit. First, a few terms for clarification:


DMA PROCESS - A DMA process is the execution
of a programmed DMA task from beginning to end.
Each DMA process requires initial programming by
the host 80376 microprocessor.


BUFFER TRANSFER - The action required by the
DMA to transfer an entire buffer.


DATA TRANSFER - The DMA action in which a
group of bytes or words are moved between devices
by the DMA Controller. A data transfer operation
may involve movement of one or many bytes.


BUS CYCLE - Access by the DMA to a single byte
or word.


Each DMA channel consists of three major
components. These components are identified by the
contents of programmable registers which define the
memory or I/O devices being serviced by the DMA.
They are the Target, the Requester, and the Byte
Count. They will be defined generically here and in
greater detail in the DMA register definition section.


The Requester is the device which requires service 
by the 82370 DMA Controller, and makes the re- 
quest for service. All of the control signals which the 
DMA monitors or generates for specific channels 
are logically related to the Requester. Only the Re- 
quester is considered capable of initiating or termi- 
nating a DMA process. 


The Target is the device with which the Requester 
wishes to communicate. As far as the DMA process 
is concerned, the Target is a slave which is incapa- 
ble of control over the process. 


The direction of data transfer can be either from Re- 
quester to Target or from Target to Requester; i.e. 
each can be either a source or a destination. 


The Requester and Target may each be either I/O 
or memory. Each has an address associated with it 
that can be incremented, decremented, or held con- 
stant. The addresses are stored in the Requester 
Address Registers and Target Address Registers,
respectively. These registers have two parts: one 
which contains the current address being used in the 
DMA process (Current Address Register), and one 
which holds the programmed base address (Base 
Address Register). The contents of the Base Regis- 
ters are never changed by the 82370 DMA Control- 
ler. The Current Registers are incremented or decre- 
mented according to the progress of the DMA pro- 
cess. 


The Byte Count is the component of the DMA pro- 
cess which dictates the amount of data which must 
be transferred. Current and Base Byte Count Regis- 
ters are provided. The Current Byte Count Register 
is decremented once for each byte transferred by 
the DMA process. When the register is decremented 
past zero, the Byte Count is considered 'expired' 
and the process is terminated or restarted, depend- 
ing on the mode of operation of the channel. The 
point at which the Byte Count expires is called 'Ter- 
minal Count' and several status signals are depen- 
dent on this event. 


Each channel of the 82370 DMA Controller also 
contains a 32-bit Temporary Register for use in as- 
sembling and disassembling non-aligned data. The 
operation of this register is transparent to the user, 
although the contents of it may affect the timing of 
some DMA handshake sequences. Since there is 
data storage available for each channel, the DMA 
Controller can be interrupted without loss of data. 


To avoid unexpected results, care should be taken
in programming the byte count correctly when
assembling and disassembling non-aligned data. For
example:


Words to Bytes: 
Transferring two words to bytes, but setting the byte 
count to three, will result in three bytes transferred 
and the final byte flushed. 


Bytes to Words: 
Transferring six bytes to three words, but setting the 
byte count to five, will result in the sixth byte trans- 
ferred being undefined. 


The 82370 DMA Controller is a slave on the bus until 
a request for DMA service is received via either a 
software request command or a hardware request 
signal. The host processor may access any of the 
control/status or channel registers at any time the 
82370 is a bus slave. Figure 3-2 shows the flow of 
operations that the DMA Controller performs. 


At the time a DMA service request is received, the
DMA Controller issues a bus hold request to the
host processor. The 82370 becomes the bus master
when the host relinquishes the bus by asserting a
hold acknowledge signal. The channel to be serviced
will be the one with the highest priority at the
time the DMA Controller becomes the bus master.
The DMA Controller will remain in control of the bus
until the hold acknowledge signal is removed, or
until the current DMA transfer is complete.


While the 82370 DMA Controller has control of the 
bus, it will perform the required data transfer(s). The 
type of transfer, source and destination addresses, 
and amount of data to transfer are programmed in 
the control registers of the DMA channel which re- 
ceived the request for service. 


At completion of the DMA process, the 82370 will 
remove the bus hold request. At this time the 82370 
becomes a slave again, and the host returns to be- 
ing a master. If there are other DMA channels with 
requests pending, the controller will again assert the 
hold request signal and restart the bus arbitration 
and switching process. 


There are fourteen control signals dedicated to the
DMA process. They include eight DMA Channel
Requests (DREQn), three Encoded DMA Acknowledge
signals (EDACKn), Processor Hold and Hold
Acknowledge (HOLD, HLDA), and End-of-Process
(EOP#). The DREQn inputs and EDACK (0-2)
outputs are handshake signals to the devices requiring
DMA service. The HOLD output and HLDA input are
handshake signals to the host processor. Figure 3-3
shows these signals and how they interconnect
between the 82370 DMA Controller, and the Requester
and Target devices.


3.2.1 DREQn and EDACK (0-2)


These signals are the handshake signals between 
the peripheral and the 82370. When the peripheral 
requires DMA service, it asserts the DREQn signal 
of the channel which is programmed to perform the 
service. The 82370 arbitrates the DREQn against 
other pending requests and begins the DMA pro- 
cess after finishing other higher priority processes. 


When the DMA service for the requested channel is 
in progress, the EDACK (0-2) signals represent the 
DMA channel which is accessing the Requester. 
The 3-bit code on the EDACK (0-2) lines indicates 
the number of the channel presently being serviced. 
Table 3-2 shows the encoding of these signals. Note 
that Channel 4 does not have a corresponding hard- 
ware acknowledge. 


The DMA acknowledge (EDACK) signals indicate
the active channel only during DMA accesses to the
Requester. During accesses to the Target, EDACK
(0-2) has the idle code (100). EDACK (0-2) can
thus be used to select a Requester device during a
transfer.


DREQn can be programmed as either an Asynchro- 
nous or Synchronous input. See section 3.4.1 for de- 
tails on synchronous versus asynchronous operation 
of these pins. 


Table 3-2. EDACK Encoding During a DMA Transfer


EDACK2  EDACK1  EDACK0  Active Channel
------  ------  ------  --------------
  0       0       0     0
  0       0       1     1
  0       1       0     2
  0       1       1     3
  1       0       0     Target Access
  1       0       1     5
  1       1       0     6
  1       1       1     7


The EDACKn signals are always active. They either 
indicate 'no acknowledge' or they indicate a bus ac- 
cess to the requester. The acknowledge code is ei- 
ther 100, for an idle DMA or during a DMA access to 
the Target, or 'n' during a Requester access, where 
n is the binary value representing the channel. A 
simple 3-line to 8-line decoder can be used to pro- 
vide discrete acknowledge signals for the peripher- 
als. 
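The EDACK encoding and the suggested 3-line to 8-line decoder can be modeled in C as shown below. This is a hedged sketch: the function names are hypothetical, and the active-LOW, one-hot output convention is an assumption borrowed from common decoder parts rather than something the text specifies. Code 100 is treated as "no acknowledge", so no output is asserted for it.

```c
#include <stdint.h>

#define EDACK_IDLE 0x4u   /* binary 100: idle or Target access */

/* Returns the channel number being acknowledged (0-3 or 5-7),
 * or -1 when the code is 100 (no Requester acknowledge). */
static int edack_to_channel(unsigned edack)
{
    edack &= 0x7u;
    return (edack == EDACK_IDLE) ? -1 : (int)edack;
}

/* Model of a 3-to-8 decoder with active-LOW outputs (assumed).
 * Bit n of the result is 0 while channel n is acknowledged. */
static uint8_t edack_decode_active_low(unsigned edack)
{
    edack &= 0x7u;
    if (edack == EDACK_IDLE)
        return 0xFFu;                 /* nothing acknowledged */
    return (uint8_t)~(1u << edack);
}
```

In hardware the same effect is obtained by leaving decoder output 4 unconnected, since Channel 4 has no hardware acknowledge.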


3.2.2 HOLD AND HLDA


The Hold Request (HOLD) and Hold Acknowledge 
(HLDA) signals are the handshake signals between 
the DMA Controller and the host processor. HOLD is 
an output from the 82370 and HLDA is an input. 
HOLD is asserted by the DMA Controller when there 
is a pending DMA request, thus requesting the proc- 
essor to give up control of the bus so the DMA pro- 
cess can take place. The 80376 responds by assert- 
ing HLDA when it is ready to relinquish control of the 
bus. 
The 82370 will begin operations on the bus one
clock cycle after the HLDA signal goes active. For 
this reason, other devices on the bus should be in 
the slave mode when HLDA is active. 


HOLD and HLDA should not be used to gate or se- 
lect peripherals requesting DMA service. This is be- 
cause of the use of DMA-like operations by the 
DRAM Refresh Controller. The Refresh Controller is 
arbitrated with the DMA Controller for control of the 
bus, and refresh cycles have the highest priority. A 
refresh cycle will take place between DMA cycles 
without relinquishing bus control. See section 3.4.3 
for a more detailed discussion of the interaction be- 
tween the DMA Controller and the DRAM Refresh 
Controller. 


EOP# is a bi-directional signal used to indicate the 
end of a DMA process. The 82370 activates this as 
an output during the T2 states of the last Requester 
bus cycle for which a channel is programmed to exe- 
cute. The Requester should respond by either with- 
drawing its DMA request, or interrupting the host 
processor to indicate that the channel needs to be 
programmed with a new buffer. As an input, this sig- 
nal is used to tell the DMA Controller that the periph- 
eral being serviced does not require any more data 
to be transferred. This indicates that the current 
buffer is to be terminated. 


EOP# can be programmed as either an Asynchro- 
nous or a Synchronous input. See section 3.4.1 for 
details on synchronous versus asynchronous opera- 
tion of this pin. 


3.3 Modes of Operation 


The 82370 DMA Controller has many independent 
operating functions. When designing peripheral in- 
terfaces for the 82370 DMA Controller, all of the 
functions or modes must be considered. All of the 
channels are independent of each other (except in 
priority of operation) and can operate in any of the 
modes. Many of the operating modes, though inde- 
pendently programmable, affect the operation of 
other modes. Because of the large number of
combinations possible, each programmable mode is
discussed here with its effects on the operation of other
modes. The entire list of possible combinations will 
not be presented. 


Table 3-1 shows the categories of DMA features
available in the 82370. Each of the five major
categories is independent of the others. The
sub-categories are the available modes within the major
function or mode category. The following sections
explain each mode or function and its relation to other
features.


Table 3-1. DMA Operating Modes


I. TARGET/REQUESTER DEFINITION
   a. Data Transfer Direction
   b. Device Type
II. BUFFER PROCESSES
   a. Single Buffer Process
   b. Buffer Auto-Initialize Process
   c. Buffer Chaining Process
III. DATA TRANSFER/HANDSHAKE MODES
   a. Single Transfer Mode
   b. Demand Transfer Mode
   c. Block Transfer Mode
   d. Cascade Mode
IV. PRIORITY ARBITRATION
   a. Fixed
   b. Rotating
   c. Programmable Fixed
V. BUS OPERATION
   a. Fly-By (Single-Cycle)/Two-Cycle
   b. Data Path Width
   c. Read, Write, or Verify Cycles


All DMA transfers involve three devices: the DMA 
Controller, the Requester, and the Target. Since the 
devices to be accessed by the DMA Controller vary 
widely, the operating characteristics of the DMA 
Controller must be tailored to the Requester and 
Target devices. 


The Requester can be defined as either the source 
or the destination of the data to be transferred. This 
is done by specifying a Write or a Read transfer, 
respectively. In a Read transfer, the Target is the 
data source and the Requester is the destination for 
the data. In a Write transfer, the Requester is the 
source and the Target is the destination. 


The Requester and Target addresses can each be 
independently programmed to be incremented, dec- 
remented, or held constant. As an example, the 
82370 is capable of reversing a string of data by 
having the Requester address increment and the 
Target address decrement in a memory-to-memory 
transfer. 
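
The string-reversal case can be illustrated with a small software model. This is only a sketch of the addressing behavior, not of the 82370 itself: the function name and its parameters are hypothetical, transfers are assumed to be one byte wide, the transfer is assumed to be a Write (Requester is the source), and `byte_count` is simply the number of bytes moved rather than the value written to the Byte Count register.

```python
def dma_memory_to_memory(memory, requester_addr, target_addr, byte_count,
                         requester_step=+1, target_step=-1):
    """Model a Write transfer: copy bytes from Requester to Target.

    Each address is adjusted independently after every byte:
    +1 increments, -1 decrements, 0 holds the address constant.
    """
    for _ in range(byte_count):
        memory[target_addr] = memory[requester_addr]
        requester_addr += requester_step
        target_addr += target_step
    return memory


# Source string at addresses 0-4, destination buffer at addresses 5-9.
mem = bytearray(b"HELLO" + bytes(5))
# Requester address increments from 0; Target address decrements from 9.
dma_memory_to_memory(mem, requester_addr=0, target_addr=9, byte_count=5)
reversed_copy = bytes(mem[5:10])
```

With both steps set to +1 the same routine models an ordinary copy, and with a step of 0 it models a device whose address is held constant, such as an I/O port.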


The 82370 DMA Controller allows three programma- 
ble Buffer Transfer Processes. These processes de- 
fine the logical way in which a buffer of data is ac- 
cessed by the DMA. 


The three Buffer Transfer Processes include the Sin- 
gle Buffer Process, the Buffer Auto-Initialize Pro- 
cess, and the Buffer Chaining Process. These pro- 
cesses require special programming considerations. 
See the DMA Programming section for more details 
on setting up the Buffer Transfer Processes. 


Single Buffer Process


The Single Buffer Process allows the DMA channel
to transfer only one buffer of data. When the buffer
has been completely transferred (Current Byte
Count decremented past zero or EOP# input active),
the DMA process ends and the channel becomes
idle. In order for that channel to be used
again, it must be reprogrammed.


The Single Buffer Process is usually used when the 
amount of data to be transferred is known exactly, 
and it is also known that there is not likely to be any 
data to follow before the operating system can re- 
program the channel. 


The Buffer Auto-Initialize Process allows multiple 
groups of data to be transferred to or from a single 
buffer. This process does not require reprogram- 
ming. The Current Registers are automatically repro- 
grammed from the Base Registers when the current 
process is terminated, either by an expired Byte 
Count or by an external EOP# signal. The data 
transferred will always be between the same Target 
and Requester.


The auto-initialization/process-execution cycle is re- 
peated until the channel is either disabled or re-pro- 
grammed. 


The Buffer Chaining Process is useful for transferring
large quantities of data into non-contiguous
buffer areas. In this process, a single channel is
used to process data from several buffers, while 
having to program the channel only once. Each new 
buffer is programmed in a pipelined operation that 
provides the new buffer information while the old 
buffer is being processed. The chain is created by 
loading new buffer information while the 82370 DMA 
Controller is processing the Current Buffer. When 
the Current Buffer expires, the 82370 DMA Control- 
ler automatically restarts the channel using the new 
buffer information. 


Loading the new buffer information is done by an 
interrupt routine which is requested by the 82370. 
Interrupt Request 1 (IRQ1) is tied internally to the 
82370 DMA Controller for this purpose. IRQ1 is gen- 
erated by the 82370 when the new buffer informa- 
tion is loaded into the channel's Current Registers, 
leaving the Base Registers 'empty'. The interrupt 
service routine loads new buffer information into the 
Base Registers. The host processor is required to 
load the information for another buffer before the 
current Byte Count expires. The process repeats un- 
til the host programs the channel back to single buff- 
er operation, or until the channel runs out of buffers. 


The channel runs out of buffers when the Current 
Buffer expires and the Base Registers have not yet 
been loaded with new buffer information. When this 
occurs, the channel must be reprogrammed. 


If an external EOP# is encountered while executing 
a Buffer Chaining Process, the current buffer is con- 
sidered expired and the new buffer information is 
loaded into the Current Registers. If the Base Regis- 
ters are 'empty', the chain is terminated. 


The channel uses the Base Target Address Register 
as an indicator of whether or not the Base Registers 
are full. When the most significant byte of the Base 
Target Register is loaded, the channel considers all 
of the Base Registers loaded, and removes the in- 
terrupt request. This requires that the other Base 
Registers (Base Requester Address, Base Byte 
Count) must be loaded before the Base Target Ad- 
dress Register. The reason for implementing the re- 
loading process this way is that, for most applica- 
tions, the Byte Count and the Requester will not 
change from one buffer to the next, and therefore do 
not need to be reprogrammed. The details of pro- 
gramming the channel for the Buffer Chaining Pro- 
cess can be found in the section on DMA program- 
ming. 
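
The Base Register handshake described above can be sketched as a toy software model. Everything here is an illustrative assumption rather than the 82370 programming interface: the class and method names are hypothetical, and a single write to the Base Target Address stands in for loading its most significant byte.

```python
class ChainingChannel:
    """Toy model of one channel's Base/Current registers in Buffer Chaining."""

    def __init__(self):
        self.base = {"requester": None, "byte_count": None, "target": None}
        self.base_full = False   # True once the Base Target Address is written
        self.current = None      # registers used for the buffer in progress
        self.irq1 = False        # chaining interrupt request to the host

    def write_base(self, name, value):
        self.base[name] = value
        if name == "target":
            # Loading the Target address marks ALL Base Registers as loaded,
            # so Requester and Byte Count must already hold valid values.
            self.base_full = True
            self.irq1 = False

    def buffer_expired(self):
        """Current Buffer expired. Returns True if the chain continues."""
        if not self.base_full:
            return False         # out of buffers: channel must be reprogrammed
        self.current = dict(self.base)
        self.base_full = False   # Base Registers are now 'empty'
        self.irq1 = True         # request the next buffer from the host
        return True


ch = ChainingChannel()
ch.write_base("requester", 0x300)
ch.write_base("byte_count", 511)
ch.write_base("target", 0x1000)       # written last
first = ch.buffer_expired()           # chain continues, IRQ1 raised
irq_after_first = ch.irq1
ch.write_base("target", 0x8000)       # interrupt routine reloads Target only
second = ch.buffer_expired()
third = ch.buffer_expired()           # Base never reloaded: chain ends
```

Because the Requester address and Byte Count persist in the Base Registers, the interrupt routine in this model only rewrites the Target address for each new buffer.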


Three Data Transfer Modes are available in the
82370 DMA Controller. They are the Single Transfer,
Block Transfer, and Demand Transfer Modes.
Any of these Data Transfer Modes can be used in conjunction
with any one of the three Buffer Transfer modes: Single
Buffer, Auto-Initialized Buffer, and Buffer Chaining.
These modes are independently available for all DMA channels.


Different devices being serviced by the DMA Con- 
troller require different handshaking sequences for 
data transfers to take place. Three handshaking 
modes are available on the 82370, giving the de- 
signer the opportunity to use the DMA Controller as 
efficiently as possible. The speed at which data can
be presented or read by a device can affect the way
a DMA Controller uses the host's bus, thereby af- 
fecting not only data throughput during the DMA pro- 
cess, but also affecting the host's performance by 
limiting its access to the bus. 


In the Single Transfer Mode, one data transfer to or
from the Requester is performed by the DMA Controller
at a time. The DREQn input is arbitrated and
the HOLD/HLDA sequence is executed for each
transfer. Transfers continue in this manner until the
Byte Count expires, or until EOP# is sampled active.
If the DREQn input is held active continuously, the
entire DREQ-HOLD-HLDA-DACK sequence is repeated
until the programmed number
of bytes has been transferred. Bus control is released
to the host between each transfer. Figure 3-4
shows the logical flow of events which make up a
buffer transfer using the Single Transfer Mode. Refer
to section 3.4 for an explanation of the bus control
arbitration procedure.


The Single Transfer Mode is used for devices which
require complete handshake cycles with each data
access. Data is transferred to or from the Requester
only when the Requester is ready to perform the
transfer. Each transfer requires the entire
DREQ-HOLD-HLDA-DACK handshake cycle. Figure 3-5
shows the timing of the Single Transfer Mode cycle.


Figure 3-4. Buffer Transfer in Single Transfer Mode


The Single Transfer Mode is more efficient (15%-20%)
in the case where the source is the Target. Because of the
internal pipeline of the 82370 DMA Controller, two idle
states are added at the end of a transfer in the case where the
source is the Requester.


In the Block Transfer Mode, the DMA process is initiated
by a DMA request and continues until the Byte
Count expires, or until EOP# is activated by the Requester.
The DREQn signal need only be held active
until the first Requester access. Only a refresh cycle
will interrupt the block transfer process.


Figure 3-6 illustrates the operation of the DMA during
the Block Transfer Mode. Figure 3-7 shows the
timing of the handshake signals during Block Mode
Transfers.
Figure 3-6. Buffer Transfer in Block Transfer Mode


The Demand Transfer Mode provides the most flexible
handshaking procedures during the DMA process.
A Demand Transfer is initiated by a DMA request.
The process continues until the Byte Count
expires, or an external EOP# is encountered. If the
device being serviced (Requester) desires, it can
interrupt the DMA process by de-activating the
DREQn line. Action is taken on the condition of
DREQn during Requester accesses only. The access
during which DREQn is sampled inactive is the
last Requester access which will be performed during
the current transfer. Figure 3-8 shows the flow of
events during the transfer of a buffer in the Demand
Mode.


When the DREQn line goes inactive, the DMA Con- 
troller will complete the current transfer, including 
any necessary accesses to the Target, and relinquish
control of the bus to the host. The current process
information is saved (byte count, Requester
and Target addresses, and Temporary Register).


Figure 3-8. Buffer Transfer in Demand Transfer Mode


The Requester can restart the transfer process by
reasserting DREQn. The 82370 will arbitrate the
request with other pending requests and begin the
process where it left off. Figure 3-9 shows the timing
of handshake signals during Demand Transfer Mode
operation.


Using the Demand Transfer Mode allows peripherals
to access memory in small, irregular bursts without
wasting bus control time. The 82370 is designed to
give the best possible bus control latency in the
Demand Transfer Mode. Bus control latency is defined
here as the time from the last active bus cycle of the
previous bus master to the first active bus cycle of
the new bus master. The 82370 DMA Controller will
perform its first bus access cycle two bus states
after HLDA goes active. In the typical configuration,
bus control is returned to the host one bus state
after DREQn goes inactive.


There are two cases where there may be more than 
one bus state of bus control latency at the end of a 
transfer. The first is at the end of an Auto-Initialize 
process, and the second is at the end of a process 
where the source is the Requester and Two-Cycle 
transfers are used. 


When a Buffer Auto-Initialize Process is complete,
the 82370 requires seven bus states to reload the 
Current Registers from the Base Registers of the 
Auto-Initialized channel. The reloading is done while 
the 82370 is still the bus master so that it is prepared 
to service the channel immediately after relinquish- 
ing the bus, if necessary. 




In the case where the Requester is the source, and 
Two-Cycle transfers are being used, there are two 
extra idle states at the end of the transfer process. 
This occurs due to the housekeeping in the DMA's 
internal pipeline. These two idle states are present 
only after the very last Requester access, before the 
DMA Controller de-activates the HOLD signal. 


DMA channel priority can be programmed into one 
of two arbitration methods: Fixed or Rotating. The 
four lower DMA channels and the four upper DMA 
channels operate as if they were two separate DMA 
controllers operating in cascade. The lower group of
four channels (0-3) is always prioritized between
channels 7 and 4 of the upper group of channels (4-7).
Figure 3-10 shows a pictorial representation of
the priority grouping.


The priority can thus be set up as rotating for one 
group of channels and fixed for the other, or any 
other combination. While in Fixed Priority, the pro- 
grammer can also specify which channel has the 
lowest priority.


The 82370 DMA Controller defaults to Fixed Priority.
Channel 0 has the highest priority, then 1, 2, 3, 4, 5, 
6, 7. Channel 7 has the lowest priority. Any time the 
DMA Controller arbitrates DMA requests, the re- 
questing channel with the highest priority will be 
serviced next. 


Fixed Priority can be entered into at any time by a 
software command. The priority levels in effect after 
the mode switch are determined by the current set- 
ting of the Programmable Priority. 


Programmable Priority is available for fixing the prior- 
ity of the DMA channels within a group to levels oth- 
er than the default. Through a software command, 
the channel to have the lowest priority in a group 
can be specified. Each of the two groups of four 
channels can have the priority fixed in this way. The 
other channels in the group will follow the natural 
Fixed Priority sequence. This mode affects only the 
priority levels while operating with Fixed Priority. 


For example, if channel 2 is programmed to have the 
lowest priority in its group, channel 3 has the highest 
priority. In descending order, the other channels 
would have the following priority: (3,0,1,2),4,5,6,7 
(channel 2 lowest, channel 3 highest). If the upper
group were programmed to have channel 5 as the
lowest priority channel, the priority would be (again, 
highest to lowest): 6,7, (3,0,1,2), 4,5. Figure 3-11 
shows this example pictorially. The lower group is 
always prioritized as a fifth channel of the upper 
group (between channels 4 and 7). 
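
The ordering rule in this example can be expressed as a short calculation. The following is a sketch only; the function name and its parameters are hypothetical, and it assumes that the lowest-priority channel of the upper group is always one of channels 4-7 (the text does not say whether the lower group itself can be named as lowest).

```python
def fixed_priority_order(lowest_lower=3, lowest_upper=7):
    """Return the eight channels, highest priority first, in Fixed Priority.

    lowest_lower: channel (0-3) programmed as lowest in the lower group.
    lowest_upper: channel (4-7) programmed as lowest in the upper group.
    """
    # Within a group, priority follows the natural sequence starting just
    # after the channel that was programmed as lowest.
    lower = [(lowest_lower + 1 + i) % 4 for i in range(4)]
    # The lower group ("L") acts as a fifth member of the upper group,
    # placed between channel 7 and channel 4 in the natural sequence.
    ring = ["L", 4, 5, 6, 7]
    start = (ring.index(lowest_upper) + 1) % len(ring)
    upper = ring[start:] + ring[:start]
    order = []
    for slot in upper:
        order.extend(lower if slot == "L" else [slot])
    return order


default_order = fixed_priority_order()
example_lower = fixed_priority_order(lowest_lower=2)
example_both = fixed_priority_order(lowest_lower=2, lowest_upper=5)
```

The three calls correspond to the default priority and to the two cases worked through in the text.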


The DMA Controller will only accept Programmable 
Priority commands while the addressed group is op- 
erating in Fixed Priority. Switching from Fixed to Ro- 
tating Priority preserves the current priority levels. 
Switching from Rotating to Fixed Priority returns the 
priority levels to those which were last programmed 
by use of Programmable Priority. 


Rotating Priority allows the devices using DMA to
share the system bus more evenly. An individual
channel does not retain highest priority after being
serviced; priority is passed to the next highest priority
channel in the group. The channel which was
most recently serviced inherits the lowest priority.
This rotation occurs each time a channel is serviced.
Figure 3-12 shows the sequence of events as priority
is passed between channels. Note that the lower
group rotates within the upper group, and that servicing
a channel within the lower group causes rotation
within the group as well as rotation of the upper
group.


Initial state: default priority (highest to lowest).
DREQ2 and DREQ6: process channel 2.
  Channel 2 drops to lowest priority within group.
  Lower group drops to lowest priority within upper
  group. (Double Rotation)
DREQ6 (still) and DREQ7: process channel 6.
  Channel 6 drops to lowest priority within group.
DREQ7 (still) and DREQ0: process channel 7.
  Channel 7 drops to lowest priority within group.
DREQ0 (still) and DREQ1: process channel 0.
  Channel 0 drops to lowest priority within group.
  (Double Rotation)
DREQ1 (still): process channel 1.
  Channel 1 drops to lowest priority within group.


Figure 3-12. Rotating Channel Priority.
Lower and upper groups are programmed
for the Rotating Priority Mode.
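
The request sequence of Figure 3-12 can be replayed with a small arbiter model. This is a behavioral sketch under stated assumptions, not a description of the 82370 internals: the class and method names are hypothetical, both groups are assumed to be in Rotating Priority, and the lower group is represented as a fifth slot ("L") of the upper group.

```python
class RotatingArbiter:
    """Two cascaded four-channel groups, both in Rotating Priority."""

    def __init__(self):
        self.lower = [0, 1, 2, 3]            # highest priority first
        self.upper = ["L", 4, 5, 6, 7]       # "L" is the whole lower group

    def order(self):
        """All eight channels, highest priority first."""
        out = []
        for slot in self.upper:
            out.extend(self.lower if slot == "L" else [slot])
        return out

    @staticmethod
    def _rotate(ring, serviced):
        # The serviced entry becomes lowest; the one after it becomes highest.
        i = ring.index(serviced) + 1
        return ring[i:] + ring[:i]

    def service(self, requests):
        """Grant the highest-priority requesting channel, then rotate."""
        winner = next(ch for ch in self.order() if ch in requests)
        if winner in self.lower:
            # Double rotation: within the lower group and within the upper.
            self.lower = self._rotate(self.lower, winner)
            self.upper = self._rotate(self.upper, "L")
        else:
            self.upper = self._rotate(self.upper, winner)
        return winner


arb = RotatingArbiter()
pending = [{2, 6}, {6, 7}, {7, 0}, {0, 1}, {1}]
serviced = [arb.service(reqs) for reqs in pending]
final_order = arb.order()
```

Each set in `pending` lists the DREQ lines active at one arbitration, following the figure; `serviced` then records which channel wins each time.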


Since the DMA Controller operates as two four- 
channel controllers in cascade, the overall priority 
scheme of all eight channels can take on a variety of 
forms. There are four possible combinations of prior- 
ity modes between the two groups of channels: 
Fixed Priority only (default), Fixed Priority upper 
group/Rotating Priority lower group, Rotating Priority 
upper group/Fixed Priority lower group, and Rotating 
Priority only. Figure 3-13 illustrates the operation of 
the two combined priority methods. 


[Figure 3-13: priority order, high to low, shown for two cases.
Case 1: channels 0-3 Fixed Priority, channels 4-7 Rotating Priority.
Case 2: channels 0-3 Rotating Priority, channels 4-7 Fixed Priority.
For each case the figure shows the default priority and the priority
after servicing channel 2, then channel 6, then channel 1.]

Data may be transferred by the DMA Controller us- 
ing two different bus cycle operations: Fly-By (one- 
cycle) and Two-Cycle. These bus handshake meth- 
ods are selectable independently for each channel 
through a command register. Device data path 
widths are independently programmable for both 
Target and Requester. Also selectable through soft- 
ware is the direction of data transfer. All of these 
parameters affect the operation of the 82370 on a 
bus-cycle by bus-cycle basis. 


3.3.6.1 Fly-By Transfers 


The Fly-By Transfer Mode is the fastest and most 
efficient way to use the 82370 DMA Controller to 
transfer data. In this method of transfer, the data is 
written to the destination device at the same time it 
is read from the source. Only one bus cycle is used 
to accomplish the transfer.


In the Fly-By Mode, the DMA acknowledge signal is
used to select the Requester. The DMA Controller
simultaneously places the address of the Target on
the address bus. The state of M/IO# and W/R#
during the Fly-By transfer cycle indicates the type of
Target and whether the Target is being written to or
read from. The Target's Bus Size is used as an
incrementer for the Byte Count. The Requester
address registers are ignored during Fly-By transfers.


Note that memory-to-memory transfers cannot be 
done using the Fly-By Mode. Only one memory or
I/O address is generated by the DMA Controller at a
time during Fly-By transfers. Only one of the devices 
being accessed can be selected by an address. 
Also, the Fly-By method of data transfer limits the 
hardware to accesses of devices with the same data 
bus width. The Temporary Registers are not affect- 
ed in the Fly-By Mode. 


Fly-By transfers also require that the data paths of 
the Target and Requester be directly connected. 
This requires that successive Fly-By accesses be to
word boundaries, or that the Requester be capable 
of switching its connections to the data bus. 


3.3.6.2 Two-Cycle Transfers


Two-Cycle transfers can also be performed by the 
82370 DMA Controller. These transfers require at 
least two bus cycles to execute. The data being 
transferred is read into the DMA Controller's Tempo- 
rary Register during the first bus cycle(s). The sec- 
ond bus cycle is used to write the data from the 
Temporary Register to the destination. 


If the addresses of the data being transferred are 
not word aligned, the 82370 will recognize the situa- 
tion and read and write the data in groups of bytes, 
placing them always at the proper destination. This 
process of collecting the desired bytes and putting 
them together is called "byte assembly". The re- 
verse process (reading from aligned locations and 
writing to non-aligned locations) is called "byte dis- 
assembly". 


The assembly/disassembly process takes place
transparently to the software, but can only be done
while using the Two-Cycle transfer method. The 
82370 will always perform the assembly/disassem- 
bly process as necessary for the current data trans- 
fer. Any data path widths for either the Requester or 
Target can be used in the Two-Cycle Mode. This is 
very convenient for interfacing existing 8- and 16-bit 
peripherals to the 80376's 16-bit bus. 


The 82370 DMA Controller always reads and writes
data within the word boundaries; i.e., if a word to be
read is crossing a word boundary, the DMA Controller
will perform two read operations, each reading
one byte, to read the 16-bit word into the Temporary
Register. Also, the 82370 DMA Controller always at- 
tempts to fill the Temporary Register from the 
source before writing any data to the destination. If 
the process is terminated before the Temporary 
Register is filled (TC or EOP#), the 82370 will write 
the partial data to the destination. If a process is 
temporarily suspended (such as when DREQn is de- 
activated during a demand transfer), the contents of 
a partially filled Temporary Register will be stored 
within the 82370 until the process is restarted. 


For example, if the source is specified as an 8-bit 
device and the destination as a 32-bit device, there 
will be four reads as necessary from the 8-bit source 
to fill the Temporary Register. Then the 82370 will 
write the 32-bit contents to the destination in two
cycles of 16 bits each. This cycle will repeat until the
process is terminated or suspended. 
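
The access counts in this example follow from simple arithmetic, sketched below. The function name and parameters are hypothetical, and the sketch assumes aligned addresses, a 16-bit bus, and that each access moves the smaller of the device width and the bus width. It counts accesses per 32 bits of data moved and says nothing about bus states or wait states.

```python
def accesses_per_32_bits(source_bits, dest_bits, bus_bits=16):
    """Return (reads, writes) needed to move 32 bits in Two-Cycle Mode."""
    # A device wider than the bus is still accessed one bus width at a time.
    reads = 32 // min(source_bits, bus_bits)
    writes = 32 // min(dest_bits, bus_bits)
    return reads, writes


# 8-bit source to 32-bit destination, as in the example above.
reads, writes = accesses_per_32_bits(source_bits=8, dest_bits=32)
# Two 16-bit devices, for comparison.
reads_16, writes_16 = accesses_per_32_bits(source_bits=16, dest_bits=16)
```

Comparing the two cases shows why the narrower device dominates the cost: the 8-bit source needs twice as many read accesses as a 16-bit source would for the same data.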


With Two-Cycle transfers, the devices that the 
82370 accesses can reside at any address within 
I/O or memory space. The device must be able to 
decode the byte-enables (BLE#, BHE#). Also, if the 
device cannot accept data in byte quantities, the 
programmer must take care not to allow the DMA 
Controller to access the device on any address oth- 
er than the device boundary. 


3.3.6.3 Data Path Width and Data Transfer Rate Considerations


The number of bus cycles used to transfer a single 
"word" of data is affected by whether the Two-Cycle 
or the Fly-By (Single-Cycle) transfer method is used. 


The number of bus cycles used to transfer data di- 
rectly affects the data transfer rate. Inefficient use of 
bus cycles will decrease the effective data transfer 
rate that can be obtained. Generally, the data trans- 
fer rate is halved by using Two-Cycle transfers in- 
stead of Fly-By transfers. 


The choice of data path widths of both Target and 
Requester affects the data transfer rate also. During 
each bus cycle, the largest pieces of data possible 
should be transferred. 


The data path width of the devices to be accessed 
must be programmed into the DMA controller. The 
82370 defaults after reset to 8-bit-to-8-bit data trans- 
fers, but the Target and Requester can have differ- 
ent data path widths, independent of each other and 
independent of the other channels. Since this is a 
software programmable function, more discussion of
the uses of this feature is found in the section on
programming.


3.3.6.4 Read, Write and Verify Cycles


Three different bus cycle types may be used in a
data transfer. They are the Read, Write and Verify 
cycles. These cycle types dictate the way in which 
the 82370 operates on the data to be transferred. 


A Read Cycle transfers data from the Target to the 
Requester. A Write Cycle transfers data from the
Requester to the Target. In a Fly-By transfer, the address
and bus status signals indicate the access
(read or write) to the Target; the access to the Requester
is assumed to be the opposite.


The Verify Cycle is used to perform a data read only. 
No write access is indicated or assumed in a Verify 
Cycle. The Verify Cycle is useful for validating block 
fill operations. An external comparator must be pro- 
vided to do any comparisons on the data read. 


3.4 Bus Arbitration and Handshaking


Figure 3-14 shows the flow of events in the DMA
request arbitration process. The arbitration
sequence starts when the Requester asserts a DREQn
(or DMA service is requested by software). Figure
3-15 shows the timing of the sequence of events
following a DMA request. This sequence is executed
for each channel that is activated. The DREQn
signal can be replaced by a software DMA channel
request with no change in the sequence.


After the Requester asserts the service request, the 
82370 will request control of the bus via the HOLD 
signal. The 82370 will always assert the HOLD sig- 
nal one bus state after the service request is assert- 
ed. The 80376 responds by asserting the HLDA sig- 
nal, thus releasing control of the bus to the 82370 
DMA Controller. 


Priority of pending DMA service requests is arbitrat- 
ed during the first state after HLDA is asserted by 
the 80376. The next state will be the beginning of 
the first transfer access of the highest priority pro- 
cess. 


When the 82370 DMA Controller is finished with its 
current bus activity, it returns control of the bus to 
the host processor. This is done by driving the 
HOLD signal inactive. The 82370 does not drive any 
address or data bus signals after HOLD goes low. It 
enters the Slave Mode until another DMA process is 
requested. The processor acknowledges that it has
regained control of the bus by forcing the HLDA signal
inactive. Note that the 82370's DMA Controller
will not re-request control of the bus until the entire 
HOLD/HLDA handshake sequence is complete. 


Figure 3-14. Bus Arbitration and DMA Sequence


The 82370 DMA Controller will terminate a current 
DMA process for one of three reasons: expired byte 
count, end-of-process command (EOP# activated) 
from a peripheral, or deactivated DMA request sig- 
nal. In each case, the controller will de-assert HOLD 
immediately after completing the data transfer in 
progress. These three methods of process termina- 
tion are illustrated in Figures 3-16, 3-19 and 3-18, 
respectively. 


An expired byte count indicates that the current pro- 
cess is complete as programmed and the channel 
has no further transfers to process. The channel 
must be restarted according to the currently pro- 
grammed Buffer Transfer Mode, or reprogrammed 
completely, including a new Buffer Transfer Mode.


Channel priority resolution takes place during the bus state before HLDA is asserted, allowing the DMA Controller to
respond to HLDA without extra idle bus states.


If the peripheral activates the EOP# signal, it is indi- 
cating that it will not accept or deliver any more data 
for the current buffer. The 82370 DMA Controller 
considers this as a completion of the channel's cur- 
rent process and interprets the condition the same 
way as if the byte count expired. 


The action taken by the 82370 DMA Controller in 
response to a de-activated DREQn signal depends 
on the Data Transfer Mode of the channel. In the 
Demand Mode, data transfers will take place as long 
as the DREQn is active and the byte count has not 
expired. In the Block Mode, the controller will com- 
plete the entire block transfer without relinquishing 
the bus, even if DREQn goes inactive before the
transfer is complete. In the Single Mode, the controller
will execute single data transfers, relinquishing
the bus between each transfer, as long as DREQn is 
active. 
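
The per-mode behavior described above can be summarized as a decision made after each Requester access. The sketch below is a simplification with hypothetical names and return values; it ignores refresh cycles (which can interrupt any process), the synchronous/asynchronous sampling point, and auto-initialization.

```python
def after_requester_access(mode, dreq_active, byte_count_expired,
                           eop_active=False):
    """Decide what the controller does once a Requester access completes.

    Returns one of:
      "terminate"   - current process ends (byte count expired or EOP#)
      "hold_bus"    - next transfer follows without releasing the bus
      "release_bus" - HOLD is dropped; the process resumes only after a
                      new arbitration while DREQn is active
    """
    if byte_count_expired or eop_active:
        return "terminate"
    if mode == "single":
        return "release_bus"      # bus released between every transfer
    if mode == "block":
        return "hold_bus"         # DREQn is ignored once the block starts
    if mode == "demand":
        return "hold_bus" if dreq_active else "release_bus"
    raise ValueError(f"unknown transfer mode: {mode!r}")


block_result = after_requester_access("block", dreq_active=False,
                                      byte_count_expired=False)
demand_result = after_requester_access("demand", dreq_active=False,
                                       byte_count_expired=False)
```

The two sample calls show the contrast for an inactive DREQn: the block transfer keeps the bus, while the demand transfer is suspended.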


Normal termination of a DMA process due to expiration
of the byte count (Terminal Count, TC) is
shown in Figure 3-16. The condition of DREQn is
ignored until after the process is terminated. If the 
channel is programmed to auto-initialize, HOLD will 
be held active for an additional seven clock cycles 
while the auto-initialization takes place. 


Table 3-3 shows the DMA channel activity due to 
EOP# or Byte Count expiring (Terminal Count). 


Buffer Process           Single or         Auto-           Chaining-Base
                         Chaining-Base     Initialize      Loaded
                         Empty

EVENT
Terminal Count           True    X         True    X       True    X
EOP#                     X       0         X       0       X       0

RESULTS
Current Registers        -       -         Load    Load    Load    Load
Channel Mask             Set     Set       -       -       -       -
EOP# Output              0       X         0       X       1       X
Terminal Count Status    Set     Set       Set     Set     -       -
Software Request         CLR     CLR       CLR     CLR     -       -


The 82370 always relinquishes control of the bus
between channel services. This allows the hardware 
designer the flexibility to externally arbitrate bus hold 
requests, if desired. If another DMA request is pend- 
ing when a higher priority channel service is com- 
pleted, the 82370 will relinquish the bus until the 
hold acknowledge is inactive. One bus state after 
the HLDA signal goes inactive, the 82370 will assert 
HOLD again. This is illustrated in Figure 3-17. 


3.4.1 SYNCHRONOUS AND ASYNCHRONOUS SAMPLING OF DREQn AND EOP#


As an indicator that a DMA service is to be started,
DREQn is always sampled asynchronously. It is sampled
at the beginning of a bus state and acted upon
at the end of the state. Figure 3-15 illustrates the 
start of a DMA process due to a DREQn input. 


The DREQn and EOP# inputs can be programmed 
to be sampled either synchronously or asynchro- 
nously to signal the end of a transfer. 


The synchronous mode affords the Requester one 
bus state of extra time to react to an access. This 
means the Requester can terminate a process on 
the current access, without losing any data. The 
asynchronous mode requires that the input signal be 
presented prior to the beginning of the last state of 
the Requester access.


The timing relationships of the DREQn and EOP#
signals to the termination of a DMA transfer are 
shown in Figures 3-18 and 3-19. Figure 3-18 shows 
the termination of a DMA transfer due to inactive 
DREQn. Figure 3-19 shows the termination of a 
DMA process due to an active EOP# input. 


In the Synchronous Mode, DREQn and EOP# are 
sampled at the end of the last state of every Re- 
quester data transfer cycle. If EOP# is active or 
DREQn is inactive at this time, the 82370 recognizes 
this access to the Requester as the last transfer. At 
this point, the 82370 completes the transfer in prog- 
ress, if necessary, and returns bus control to the 
host.


In the asynchronous mode, the inputs are sampled 
at the beginning of every state of a Requester ac- 
cess. The 82370 waits until the end of the state to 
act on the input. 


DREQn and EOP# are sampled at the latest possible
time when the 82370 can determine if another
transfer is required. In the Synchronous Mode,
DREQn and EOP# are sampled on the trailing edge
of the last bus state before another data access cy- 
cle begins. The Asynchronous Mode requires that 
the signals be valid one clock cycle earlier.


While in the Pipeline Mode, if the NA# signal is sampled
active during a transfer, the end of the state
where NA# was sampled active is when the 82370 
decides whether to commit to another transfer. The 
device must de-assert DREQn or assert EOP# be- 
fore NA# is asserted, otherwise the 82370 will com- 
mit to another, possibly undesired, transfer. 


Synchronous DREQn and EOP# sampling allows 
the peripheral to prevent the next transfer from oc- 
curring by de-activating DREQn or asserting EOP# 
during the current Requester access, before the 
82370 DMA Controller commits itself to another 
transfer. The DMA Controller will not perform the 
next transfer if it has not already begun the bus cy- 
cle. Asynchronous sampling allows less stringent 
timing requirements than the Synchronous Mode, 
but requires that the DREQn signal be valid at the 
beginning of the next to last bus state of the current 
Requester access. 


Using the Asynchronous Mode with zero wait states 
can be very difficult. Since the addresses and con- 
trol signals are driven by the 82370 near half-way 
through the first bus state of a transfer, and the 
Asynchronous Mode requires that DREQn be inac- 
tive before the end of the state, the peripheral being 
accessed is required to present DREQn only a few 
nanoseconds after the control information is avail- 
able. This means that the peripheral's control logic 
must be extremely fast (practically non-causal). An 
alternative is the Synchronous Mode. 


3.4.2 ARBITRATION OF CASCADED MASTER REQUESTS


The Cascade Mode allows another DMA-type de- 
vice to share the bus by arbitrating its bus accesses 
with the 82370's. Seven of the eight DMA channels 
(0-3 and 5-7) can be connected to a cascaded de- 
vice. The cascaded device requests bus control 
through the DREQn line of the channel which is pro- 
grammed to operate in Cascade Mode. Bus hold ac- 
knowledge is signalled to the cascaded device 
through the EDACK lines. When the EDACK lines 
are active with the code for the requested cascade 
channel, the bus is available to the cascaded master 
device. 


A cascade cycle begins the same way a regular 
DMA cycle begins. The requesting bus master as- 
serts the DREQn line on the 82370. This bus control 
request is arbitrated as any other DMA request 
would be. If any channel receives a DMA request, 
the 82370 requests control of the bus. When the 
host acknowledges that it has released bus control, 
the 82370 acknowledges to the requesting master 
that it may access the bus. The 82370 enters an idle 
state until the new master relinquishes control. 


A cascade cycle will be terminated by one of two 
events: DREQn going inactive, or HLDA going inac- 
tive. The normal way to terminate the cascade cycle 
is for the cascaded master to drop the DREQn signal.
Figure 3-21 shows the two cascade cycle termination
sequences.


The Refresh Controller may interrupt the cascaded 
master to perform a refresh cycle. If this occurs, the 
82370 DMA Controller will de-assert the EDACK sig- 
nal (hold acknowledge to cascaded master) and wait 
for the cascaded master to remove its hold request. 
When the 82370 regains bus control, it will perform 
the refresh cycle in its normal fashion. After the re- 
fresh cycle has been completed, and if the cascad- 
ed device has re-asserted its request, the 82370 will 
return control to the cascaded master which was in- 
terrupted. 


The 82370 assumes that it is the only device monitoring
the HLDA signal. If the system designer wishes to place
other devices on the bus as bus masters, the HLDA from
the processor must be intercepted before presenting it
to the 82370. Using the Cascade capability of the 82370
DMA Controller offers a much better solution.


The arbitration of refresh requests by the DRAM Refresh
Controller is slightly different from normal DMA channel
request arbitration. The 82370 DRAM Refresh Controller
always has the highest priority of any DMA process. It
also can interrupt a process in progress. Two types of
processes in progress may be encountered: normal DMA,
and bus master cascade.


In the event of a refresh request during a normal 
DMA process, the DMA Controller will complete the 
data transfer in progress and then execute the re- 
fresh cycle before continuing with the current DMA 
process. The priority of the interrupted process is 
not lost. If the data transfer cycle interrupted by the 
Refresh Controller is the last of a DMA process, the 
refresh cycle will always be executed before control 
of the bus is transferred back to the host. 


When the Refresh Controller request occurs during 
a cascade cycle, the Refresh Controller must be as- 
sured that the cascaded master device has relin- 
quished control of the bus before it can execute the 
refresh cycle. To do this, the DMA Controller drops 
the EDACK signal to the cascaded master and waits 
for the corresponding DREQn input to go inactive. 
By dropping the DREQn signal, the cascaded mas- 
ter relinquishes the bus. The Refresh Controller then 
performs the refresh cycle. Control of the bus is re- 
turned to the cascaded master if DREQn returns to 
an active state before the end of the refresh cycle, 
otherwise control is passed to the processor and the 
cascaded master loses its priority. 


Figure 3-21. Cascade Cycle Termination


3.5 DMA Controller Register Overview


The 82370 DMA Controller contains 44 registers which
are accessible to the host processor. Twenty-four of
these registers contain the device addresses and data
counts for the individual DMA channels (three per
channel). The remaining registers are control and
status registers for initiating and monitoring the
operation of the 82370 DMA Controller. Table 3-4 lists
the DMA Controller's registers and their accessibility.


Table 3-4. DMA Controller Registers

    Register Name                   Access

    Control/Status Registers (one each per group)
    Command Register I              write only
    Command Register II             write only
    Mode Register I                 write only
    Mode Register II                write only
    Software Request Register       read/write
    Mask Set-Reset Register         write only
    Mask Read-Write Register        read/write
    Status Register                 read only
    Bus Size Register               write only
    Chaining Register               read/write

    Channel Registers (one each per channel)
    Base Target Address             write only
    Current Target Address          read only
    Base Requester Address          write only
    Current Requester Address       read only
    Base Byte Count                 write only
    Current Byte Count              read only


The following registers are available to the host 
processor for programming the 82370 DMA Control- 
ler into its various modes and for checking the oper- 
ating status of the DMA processes. Each set of four 
DMA channels has one of each of these registers 
associated with it. 


Command Register I


Enables or disables the DMA channels as a group.
Sets the Priority Mode (Fixed or Rotating) of the
group. This write-only register is cleared by a
hardware reset, defaulting to all channels enabled and
Fixed Priority Mode.


Command Register II 


Sets the sampling mode of the DREQn and EOP#
inputs. Also sets the lowest priority channel for the
group in the Fixed Priority Mode. The functions
programmed through Command Register II default after
a hardware reset to: asynchronous DREQn and
EOP#, and channels 3 and 7 lowest priority.


Mode Register I is identical in function to the Mode
register of the 8237A. It programs the following
functions for an individually selected channel:

    Type of Transfer: read, write, verify
    Auto-Initialize: enable or disable
    Target Address Count: increment or decrement
    Data Transfer Mode: demand, single, block, cascade


Mode Register I functions default to the following 
after reset: verify transfer, Auto-Initialize disabled, In- 
crement Target address, Demand Mode. 


Mode Register II


Programs the following functions for an individually
selected channel:

    Target Address Hold: enable or disable
    Requester Address Count: increment or decrement
    Requester Address Hold: enable or disable
    Target Device Type: I/O or Memory
    Requester Device Type: I/O or Memory
    Transfer Cycles: Two-Cycle or Fly-By


Mode Register II functions are defined as follows 
after a hardware reset: Disable Target Address Hold, 
Increment Requester Address, Target (and Re- 
quester) in memory, Fly-By Transfer Cycles. Note: 
Requester Device Type ignored in Fly-By Transfers. 


Software Request Register


The DMA Controller can respond to service requests
which are initiated by software. Each channel has an
internal request status bit associated with it. The
host processor can write to this register to set or
reset the request bit of a selected channel.


The status of a group's software DMA service requests
can be read from this register as well. Each status
bit is cleared upon Terminal Count or external EOP#.


The software DMA requests are non-maskable and
subject to priority arbitration with all other software
and hardware requests. The entire register is
cleared by a hardware reset.


Mask Registers


Each channel has associated with it a mask bit
which can be set/reset to disable/enable that channel.
Two methods are available for setting and clearing
the mask bits. The Mask Set/Reset Register is a
write-only register which allows the host to select an
individual channel and either set or reset the mask
bit for that channel only. The Mask Read/Write Register
is available for reading the mask bit status and
for writing mask bits in groups of four.


The mask bits of a group may be cleared in one step 
by executing the Clear Mask Command. See the 
DMA Programming section for details. A hardware 
reset sets all of the channel mask bits, disabling all 
channels. 


Status Register


The Status register is a read-only register which
contains the Terminal Count (TC) and Service Request
status for a group. Four bits indicate the TC status
and four bits indicate the hardware request status
for the four channels in the group. The TC bits are
set when the Byte Count expires, or when an external
EOP# is asserted. These bits are cleared by
reading from the Status Register. The Service Request
bit for a channel indicates when there is a
hardware DMA request (DREQn) asserted for that
channel. When the request has been removed, the
bit is cleared.


Bus Size Register 


This write-only register is used to define the bus size 
of the Target and Requester of a selected channel. 
The bus sizes programmed will be used to dictate 
the sizes of the data paths accessed when the DMA 
channel is active. The values programmed into this 
register affect the operation of the Temporary Regis- 
ter. When 32-bit bus width is programmed, the 
82370 DMA Controller will access the device twice 
through its 16-bit external Data Bus to perform a 
32-bit data transfer. Any byte-assembly required to 
make the transfers using the specified data path 
widths will be done in the Temporary Register. The
Bus Size register of the Target is used as an
increment/decrement value for the Byte Counter and
Target Address when in the Fly-By Mode. Upon reset,
all channels default to 8-bit Targets and 8-bit
Requesters.


Chaining Register


As a command or write register, the Chaining regis- 
ter is used to enable or disable the Chaining Mode 
for a selected channel. Chaining can either be dis- 
abled or enabled for an individual channel, indepen- 
dently of the Chaining Mode status of other chan- 
nels. After a hardware reset, all channels default to 
Chaining disabled. 


When read by the host, the Chaining Register provides
the status of the Chaining Interrupt of each of the
channels. These interrupt status bits are cleared
when the new buffer information has been loaded.


3.5.2 CHANNEL REGISTERS


Each channel has three individually programmable 
registers necessary for the DMA process; they are 
the Base Byte Count, Base Target Address, and 
Base Requester Address registers. The 24-bit Base 
Byte Count register contains the number of bytes to 
be transferred by the channel. The 24-bit Base Tar- 
get Address Register contains the beginning ad- 
dress (memory or I/O) of the Target device. The 
24-bit Base Requester Address register contains the 
base address (memory or I/O) of the device which is 
to request DMA service. 


Three more registers for each DMA channel exist 
within the DMA Controller which are directly related 
to the registers mentioned above. These registers 
contain the current status of the DMA process. They 
are the Current Byte Count register, the Current Tar- 
get Address, and the Current Requester Address. It 
is these registers which are manipulated (increment- 
ed, decremented, or held constant) by the 82370 
DMA Controller during the DMA process. The Cur- 
rent registers are loaded from the Base registers at 
the beginning of a DMA process. 


The Base registers are loaded when the host proc- 
essor writes to the respective channel register ad- 
dresses. Depending on the mode in which the chan- 
nel is operating, the Current registers are typically 
loaded in the same operation. Reading from the 
channel register addresses yields the contents of 
the corresponding Current register. 


To maintain compatibility with software which ac- 
cesses an 8237A, a Byte Pointer Flip-Flop is used to 
control access to the upper and lower bytes of some 
words of the Channel Registers. These words are 
accessed as byte pairs at single port addresses. The 
Byte Pointer Flip-Flop acts as a one-bit pointer
which is toggled each time a qualifying Channel
Register byte is accessed. It always points to the
next byte of the pair to be accessed.


The Channel registers are arranged as pairs of 
words, each pair with its own port address. Addressing
the port with the Byte Pointer Flip-Flop reset
accesses the least significant byte of the pair. The
most significant byte is accessed when the Byte 
Pointer is set. 


For compatibility with existing 8237A designs, there 
is one exception to the above statements about the 
Byte Pointer Flip-Flop. The third byte (bits 16-23) of 
the Target Address is accessed through its own port
address. The Byte Pointer Flip-Flop is not affected 
by any accesses to this byte. 


The upper eight bits of the Byte Count Register are 
cleared when the least significant byte of the regis- 
ter is loaded. This provides compatibility with soft- 
ware which accesses an 8237A. The 8237A has 
16-bit Byte Count Registers. 
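

The access rules above can be sketched as a small software model. This is an illustration only, not a register-accurate emulation: the type `dma_channel_model` and the `model_*` functions are hypothetical names, and the sketch assumes that the separate port for bits 16-23 of the Byte Count leaves the Byte Pointer Flip-Flop unchanged, which the text states only for the Target Address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-side view of one channel's 24-bit Byte Count register (model only). */
typedef struct {
    uint32_t byte_count;   /* value held in the low 24 bits */
    bool     byte_pointer; /* false: next access is bits 0-7, true: bits 8-15 */
} dma_channel_model;

/* Reset the flip-flop so the next access starts at the low byte. */
static void model_clear_byte_pointer(dma_channel_model *ch)
{
    ch->byte_pointer = false;
}

/* Write to the shared low/high port; the flip-flop selects the byte,
 * then toggles. */
static void model_write_count_port(dma_channel_model *ch, uint8_t value)
{
    if (!ch->byte_pointer) {
        /* Loading the least significant byte also clears bits 16-23. */
        ch->byte_count = (ch->byte_count & 0x00FF00u) | (uint32_t)value;
    } else {
        ch->byte_count = (ch->byte_count & 0xFF00FFu) | ((uint32_t)value << 8);
    }
    ch->byte_pointer = !ch->byte_pointer;
}

/* Write to the separate port for bits 16-23.
 * ASSUMPTION: the flip-flop is not affected by this access. */
static void model_write_count_upper(dma_channel_model *ch, uint8_t value)
{
    ch->byte_count = (ch->byte_count & 0x00FFFFu) | ((uint32_t)value << 16);
}

/* Load a full 24-bit count: low byte, high byte, then the upper byte. */
static void model_load_byte_count(dma_channel_model *ch, uint32_t count)
{
    model_clear_byte_pointer(ch);
    model_write_count_port(ch, (uint8_t)(count & 0xFFu));
    model_write_count_port(ch, (uint8_t)((count >> 8) & 0xFFu));
    model_write_count_upper(ch, (uint8_t)((count >> 16) & 0xFFu));
}
```

In this model, software written for the 8237A, which writes only the low and high bytes, always ends up with a 16-bit count because the low-byte write clears bits 16-23. For the same reason the upper byte has to be written after the low byte.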


Each channel has a 32-bit Temporary Register used 
for temporary data storage during two-cycle DMA 
transfers. It is this register in which any necessary 
byte assembly and disassembly of non-aligned data 
is performed. Figure 3-22 shows how a block of data 
will be moved between memory locations with differ- 
ent boundaries. Note that the order of the data does 
not change. 


[Figure 3-22 illustration: bytes A through G at the
source appear in the same order, A through G, at the
destination.]

    Target    = source      = 00000020H
    Requester = destination = 00000053H
    Byte Count              = 000007H


Figure 3-22. Transfer of data between memory locations
with different boundaries. This will be the result,
independent of data path width.


If the destination is the Requester and an early
process termination has been indicated by the EOP#
signal or DREQn inactive in the Demand Mode, the
Temporary Register is not affected. If data remains
in the Temporary Register due to differences in data
path widths of the Target and Requester, it is neither
transferred nor lost; it is held for later transfer.


If the destination is the Target and the EOP# signal
is sensed active during the Requester access of a
transfer, the DMA Controller will complete the transfer
by sending to the Target whatever information is
in the Temporary Register at the time of process
termination. This implies that the Target could be
accessed with partial data in two accesses. For this
reason it is advisable to have an I/O device designated
as a Requester, unless it is capable of handling
partial data transfers.


3.6 DMA Controller Programming


Programming a DMA Channel to perform a needed
DMA function is in general a four step process. First
the global attributes of the DMA Controller are
programmed via the two Command Registers. These
global attributes include: priority levels, channel
group enables, priority mode, and DREQn/EOP# input
sampling.


The second step involves setting the operating 
modes of the particular channel. The Mode Regis- 
ters are used to define the type of transfer and the 
handshaking modes. The Bus Size Register and 
Chaining Register may also need to be programmed 
in this step. 


The third step in setting up the channel is to load the
Base Registers in accordance with the needs of the
operating modes chosen in step two. The Current
Registers are automatically loaded from the Base
Registers, if required by the Buffer Transfer Mode in
effect. The information loaded and the order in
which it is loaded depends on the operating mode. A
channel used for cascading, for example, needs no
buffer information and this step can be skipped
entirely.


The last step is to enable the newly programmed 
channel using one of the Mask Registers. The chan- 
nel is then available to perform the desired data 
transfer. The status of the channel can be observed 
at any time through the Status Register, Mask Reg- 
ister, Chaining Register, and Software Request reg- 
ister. 
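

The four steps can be sketched for channel 0 as follows. This is a minimal sketch under stated assumptions, not a tested driver: `port_out`, `write_log`, `ch0_setup` and `program_channel0` are hypothetical names, `port_out` only records writes so the sequence can be inspected, and the register values are passed in already encoded. The port addresses are taken from the register definitions in this data sheet, except the Clear Byte Pointer location, which is assumed to match the 8237A (0CH) and should be confirmed. The load order of the Base Registers is one plausible choice; the text notes that the required order depends on the operating mode.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint16_t port; uint8_t value; } port_write;

#define LOG_CAPACITY 32
static port_write write_log[LOG_CAPACITY];
static size_t     write_count;

/* Hypothetical stand-in for the platform's I/O write; it only logs. */
static void port_out(uint16_t port, uint8_t value)
{
    if (write_count < LOG_CAPACITY) {
        write_log[write_count].port  = port;
        write_log[write_count].value = value;
        write_count++;
    }
}

enum {
    PORT_CMD1           = 0x0008,
    PORT_CMD2           = 0x001A,
    PORT_MODE1          = 0x000B,
    PORT_MODE2          = 0x001B,
    PORT_CLEAR_MASK     = 0x000E,
    PORT_CLEAR_BYTE_PTR = 0x000C, /* ASSUMPTION: same location as the 8237A */
    CH0_TARGET          = 0x0000, CH0_TARGET_UPPER    = 0x0087,
    CH0_COUNT           = 0x0001, CH0_COUNT_UPPER     = 0x0011,
    CH0_REQUESTER       = 0x0090, CH0_REQUESTER_UPPER = 0x0091
};

typedef struct {
    uint8_t  cmd1, cmd2;   /* already-encoded Command Register values */
    uint8_t  mode1, mode2; /* already-encoded Mode Register values */
    uint32_t target, requester, byte_count; /* 24-bit values */
} ch0_setup;

/* Load a 24-bit register: low and high bytes through the shared port,
 * bits 16-23 through their own port. */
static void load_24bit(uint16_t port, uint16_t upper_port, uint32_t v)
{
    port_out(PORT_CLEAR_BYTE_PTR, 0); /* data value is not significant */
    port_out(port, (uint8_t)(v & 0xFFu));
    port_out(port, (uint8_t)((v >> 8) & 0xFFu));
    port_out(upper_port, (uint8_t)((v >> 16) & 0xFFu));
}

static void program_channel0(const ch0_setup *s)
{
    /* Step 1: global attributes. */
    port_out(PORT_CMD1, s->cmd1);
    port_out(PORT_CMD2, s->cmd2);
    /* Step 2: operating modes of the channel. */
    port_out(PORT_MODE1, s->mode1);
    port_out(PORT_MODE2, s->mode2);
    /* Step 3: Base Registers. */
    load_24bit(CH0_COUNT, CH0_COUNT_UPPER, s->byte_count);
    load_24bit(CH0_REQUESTER, CH0_REQUESTER_UPPER, s->requester);
    load_24bit(CH0_TARGET, CH0_TARGET_UPPER, s->target);
    /* Step 4: enable. Clear Mask enables every channel in group 0-3. */
    port_out(PORT_CLEAR_MASK, 0);
}
```

The sketch uses the Clear Mask Register command in step four because writing to its location is all that is required. A real system would more likely unmask a single channel through the Mask Set/Reset Register.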


Once the channel is programmed and enabled, the 
DMA process may be initiated in one of two ways, 
either by a hardware DMA request (DREQn) or a 
software request (Software Request Register). 


Once programmed to a particular Process/Mode
configuration, the channel will operate in that
configuration until programmed otherwise. For this
reason, restarting a channel after the current buffer
expires does not require complete reprogramming of the
channel. Only those parameters which have changed
need to be reprogrammed. The Byte Count Register is
always changed and must be reprogrammed. A Target or
Requester Address Register which is incremented or
decremented should be reprogrammed also.
The Buffer Process is determined by the Auto-Initialize
bit of Mode Register I and the Chaining Register.
If Auto-Initialize is enabled, Chaining should not be
used.


3.6.1.1 Single Buffer Process


The Single Buffer Process is programmed by disabling
Chaining via the Chaining Register and programming
Mode Register I for non-Auto-Initialize.


3.6.1.2 Buffer Auto-Initialize Process


Setting the Auto-Initialize bit in Mode Register I is
all that is necessary to place the channel in this mode.


Buffer Auto-Initialize must not be enabled at the same
time as the Buffer Chaining Mode, as this will have
unpredictable results.


Once the Base Registers are loaded, the channel is
ready to be enabled. The channel will reload its
Current Registers from the Base Registers each time
the Current Buffer expires, either by an expired Byte
Count or an external EOP#.


3.6.1.3 Buffer Chaining


The Buffer Chaining Process is entered into from the
Single Buffer Process. The Mode Registers should be
programmed first, with all of the Transfer Modes
defined as if the channel were to operate in the
Single Buffer Process. The channel's Base Registers
are then loaded. When the channel has been set up
in this way, and the chaining interrupt service routine
is in place, the Chaining Process can be entered by
programming the Chaining Register. Figure 3-23
illustrates the Buffer Chaining Process.


An interrupt (IRQ1) will be generated immediately
after the Chaining Process is entered, as the channel
then perceives the Base Registers as empty and in
need of reloading. It is important to have the
interrupt service routine in place at the time the
Chaining Process is entered into. The interrupt request
is removed when the most significant byte of the Base
Target Address is loaded.


The interrupt will occur again when the first buffer
expires and the Current Registers are loaded from the
Base Registers. The cycle continues until the Chaining
Process is disabled, or the host fails to respond to
IRQ1 before the Current Buffer expires.


[Figure 3-23. Buffer Chaining Process (flowchart).
Recoverable annotations: "IRQ1 will need service: load
Base Registers" and "From this point, the host can
perform another task. The interrupt service routine
left behind will maintain the channel."]
Exiting the Chaining Process can be done by resetting
the Chaining Mode Register. If an interrupt is
pending for the channel when the Chaining Register
is reset, the interrupt request will be removed. The
Chaining Process can be temporarily disabled by
setting the channel's Mask bit in the Mask Register.


The interrupt service routine for IRQ1 has the re- 
sponsibility of reloading the Base Registers as nec- 
essary. It should check the status of the channel to 
determine the cause of channel expiration, etc. It 
should also have access to operating system infor- 
mation regarding the channel, if any exists. The 
IRQ1 service routine should be capable of determin- 
ing whether the chain should be continued or termi- 
nated and act on that information. 


The Data Transfer Modes are selected via Mode 
Register I. The Demand, Single, and Block Modes 
are selected by bits D6 and D7. The individual trans- 
fer type (Fly-By vs Two-Cycle, Read-Write-Verify, 
and I/O vs Memory) is programmed through both of 
the Mode registers. 


The Cascade Mode is set by writing ones to D7 and 
D6 of Mode Register I. When a channel is pro- 
grammed to operate in the Cascade Mode, all of the 
other modes associated with Mode Registers I and II 
are ignored. The priority and DREQn/EOP# definitions
of the Command Registers will have the same
effect on the channel's operation as any other 
mode. 


There are five port addresses which, when written
to, command certain operations to be performed by
the 82370 DMA Controller. The data written to these
locations is of no consequence; writing to the
location is all that is necessary to command the 82370
to perform the indicated function. Following are
descriptions of the command functions.


Resets the Byte Pointer Flip-Flop. This command 
should be performed at the beginning of any access 
to the channel registers in order to be assured of 
beginning at a predictable place in the register pro- 
gramming sequence. 


All DMA functions are set to their default states. This 
command is the equivalent of a hardware reset to 
the DMA Controller. Functions other than those in 
the DMA Controller section of the 82370 are not af- 
fected by this command. 


Clear Mask Register
    Channels 0-3 - Location 000EH
    Channels 4-7 - Location 00CEH


This command simultaneously clears the Mask Bits 
of all channels in the addressed group, enabling all 
of the channels in the group. 


This command resets the Terminal Count Interrupt 
Request Flip-Flop. It is provided to allow the pro- 
gram which made a software DMA request to ac- 
knowledge that it has responded to the expiration of 
the requested channel(s). 


3.7 
Register Definitions 


The following diagrams outline the bit definitions and
functions of the 82370 DMA Controller's Status and 
Control Registers. The function and programming of 
the registers is covered in the previous section on 
DMA Controller Programming. An entry of "X" as a 
bit value indicates "don't care." 
Channel     Register Name       Address   Byte      Bits
                                (hex)     Pointer   Accessed

Channel 0   Target Address      00        0         0-7
                                          1         8-15
                                87        x         16-23
            Byte Count          01        0         0-7
                                          1         8-15
                                11        0         16-23
            Requester Address   90        0         0-7
                                          1         8-15
                                91        0         16-23

Channel 1   Target Address      02        0         0-7
                                          1         8-15
                                83        x         16-23
            Byte Count          03        0         0-7
                                          1         8-15
                                13        0         16-23
            Requester Address   92        0         0-7
                                          1         8-15
                                93        0         16-23

Channel 2   Target Address      04        0         0-7
                                          1         8-15
                                81        x         16-23
            Byte Count          05        0         0-7
                                          1         8-15
                                15        0         16-23
            Requester Address   94        0         0-7
                                          1         8-15
                                95        0         16-23

Channel 3   Target Address      06        0         0-7
                                          1         8-15
                                82        x         16-23
            Byte Count          07        0         0-7
                                          1         8-15
                                17        0         16-23
            Requester Address   96        0         0-7
                                          1         8-15
                                97        0         16-23

Channel 4   Target Address      C0        0         0-7
                                          1         8-15
                                8F        x         16-23
            Byte Count          C1        0         0-7
                                          1         8-15
                                D1        0         16-23
            Requester Address   98        0         0-7
                                          1         8-15
                                99        0         16-23

Channel 5   Target Address      C2        0         0-7
                                          1         8-15
                                8B        x         16-23
            Byte Count          C3        0         0-7
                                          1         8-15
                                D3        0         16-23
            Requester Address   9A        0         0-7
                                          1         8-15
                                9B        0         16-23

Channel 6   Target Address      C4        0         0-7
                                          1         8-15
                                89        x         16-23
            Byte Count          C5        0         0-7
                                          1         8-15
                                D5        0         16-23
            Requester Address   9C        0         0-7
                                          1         8-15
                                9D        0         16-23

Channel 7   Target Address      C6        0         0-7
                                          1         8-15
                                8A        x         16-23
            Byte Count          C7        0         0-7
                                          1         8-15
                                D7        0         16-23
            Requester Address   9E        0         0-7
                                          1         8-15
                                9F        0         16-23


Command Register I (write only)

    Port Addresses: Channels 0-3 - 0008H
                    Channels 4-7 - 00C8H

    GROUP MASK
        0 = ENABLE CHANNELS
        1 = DISABLE CHANNELS

    PRIORITY
        0 = FIXED PRIORITY
        1 = ROTATING PRIORITY
Command Register II (write only)

    Port Addresses: Channels 0-3 - 001AH
                    Channels 4-7 - 00DAH

    DREQn SAMPLING, EOP# SAMPLING
        0 = ASYNCHRONOUS
        1 = SYNCHRONOUS

    LOW PRIORITY LEVEL SET
        00 = CHANNEL 0(4) LOWEST
        01 = CHANNEL 1(5) LOWEST
        10 = CHANNEL 2(6) LOWEST
        11 = CHANNEL 3(7) LOWEST

Mode Register I (write only)

    Port Addresses: Channels 0-3 - 000BH
                    Channels 4-7 - 00CBH

    | B1 | B0 | TI | AI | T1 | T0 | C1 | C0 |

    C1, C0: CHANNEL SELECT
        00 = CHANNEL 0(4)
        01 = CHANNEL 1(5)
        10 = CHANNEL 2(6)
        11 = CHANNEL 3(7)

    T1, T0: TRANSFER TYPE
        00 = VERIFY
        01 = WRITE
        10 = READ
        11 = ILLEGAL
        XX IF IN CASCADE MODE

    AI: AUTO-INITIALIZE
        0 = DISABLE
        1 = ENABLE

    TI: TARGET INCREMENT/DECREMENT
        0 = INCREMENT TARGET
        1 = DECREMENT TARGET
        X IF TARGET HOLD ENABLED

    B1, B0: DATA TRANSFER MODE
        00 = DEMAND MODE
        01 = SINGLE TRANSFER MODE
        10 = BLOCK MODE
        11 = CASCADE MODE
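

As a worked illustration of the encodings above, the following sketch packs a Mode Register I value. It assumes the fields occupy the byte in the order the diagram lists them, with the Data Transfer Mode in D7-D6 (consistent with the statement that Cascade Mode is selected by writing ones to D7 and D6) and the channel select in D1-D0. The enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum { XFER_VERIFY = 0, XFER_WRITE = 1, XFER_READ = 2 } xfer_type;
typedef enum { MODE_DEMAND = 0, MODE_SINGLE = 1,
               MODE_BLOCK = 2, MODE_CASCADE = 3 } xfer_mode;

/* Pack a Mode Register I byte.
 * ASSUMED layout: D7-D6 transfer mode, D5 target decrement,
 * D4 auto-initialize, D3-D2 transfer type, D1-D0 channel within group. */
static uint8_t mode1_encode(unsigned channel, xfer_type type,
                            int auto_init, int decrement_target,
                            xfer_mode mode)
{
    return (uint8_t)(((unsigned)mode << 6) |
                     ((decrement_target ? 1u : 0u) << 5) |
                     ((auto_init ? 1u : 0u) << 4) |
                     ((unsigned)type << 2) |
                     (channel & 3u));
}
```

For example, a read transfer in Single Transfer Mode on channel 2 with Auto-Initialize enabled and an incrementing Target encodes as 5AH. Channels 4-7 use the same two-bit select (channel number modulo 4), written to the port for the second group.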
Mode Register II (write only)

    Port Addresses: Channels 0-3 - 001BH
                    Channels 4-7 - 00DBH

    TARGET HOLD
        0 = INCREMENT/DECREMENT
        1 = HOLD

    REQUESTER INCREMENT
        0 = INCREMENT
        1 = DECREMENT
        X IF REQUESTER HOLD ENABLED

    REQUESTER HOLD
        0 = INCREMENT/DECREMENT
        1 = HOLD

    TARGET DEVICE TYPE, REQUESTER DEVICE TYPE
        0 = MEMORY
        1 = INPUT/OUTPUT

    TRANSFER CYCLES
        0 = ONE-CYCLE (FLY-BY)
        1 = TWO-CYCLE

    CHANNEL SELECT
        SEE MODE REGISTER I


Software Request Register (read/write)

    Port Addresses: Channels 0-3 - 0009H
                    Channels 4-7 - 00C9H

    Write format:
    REQUEST SERVICE
        0 = REMOVE REQUEST
        1 = ASSERT REQUEST

    Read format:
        CHANNEL 0(4) REQUEST
        CHANNEL 1(5) REQUEST
        CHANNEL 2(6) REQUEST
        CHANNEL 3(7) REQUEST


Mask Set/Reset Register (write only)

    Port Addresses: Channels 0-3 - 000AH
                    Channels 4-7 - 00CAH

    MASK SET BIT
        0 = CLEAR MASK
        1 = SET MASK

Mask Read/Write Register (read/write)

    Port Addresses: Channels 0-3 - 000FH
                    Channels 4-7 - 00CFH

    CHANNEL 0(4) MASK BIT
    CHANNEL 1(5) MASK BIT
    CHANNEL 2(6) MASK BIT
    CHANNEL 3(7) MASK BIT


Status Register (read only)

    Port Addresses: Channels 0-3 - 0008H
                    Channels 4-7 - 00C8H

    CHANNEL 0(4) EXPIRED
    CHANNEL 1(5) EXPIRED
    CHANNEL 2(6) EXPIRED
    CHANNEL 3(7) EXPIRED
        1 = EXPIRED

    CHANNEL 0(4) REQUEST
    CHANNEL 1(5) REQUEST
    CHANNEL 2(6) REQUEST
    CHANNEL 3(7) REQUEST
        1 = REQUEST PENDING


Bus Size Register (write only)

    Port Addresses: Channels 0-3 - 0018H
                    Channels 4-7 - 00D8H

    CHANNEL SELECT
        SEE MODE REGISTER I

    TARGET BUS SIZE
    REQUESTER BUS SIZE

    Bus Size Encoding:
        00 = Reserved by Intel
        01 = 32-bit Bus*
        10 = 16-bit Bus
        11 = 8-bit Bus

    *If programmed as 32-bit bus width, the corresponding
    device will be accessed in two 16-bit cycles provided
    that the data is aligned within word boundary.
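

The bus size encoding is not in numeric order, so a small lookup helper can prevent mistakes. The sketch below covers only the two-bit code; the positions of the Target and Requester fields within the register are not recoverable from the listing above and are deliberately left out. The function name is hypothetical.

```c
/* Map a bus width in bits to its two-bit Bus Size code.
 * Returns -1 for widths that have no encoding (00 is reserved). */
static int bus_size_code(unsigned width_bits)
{
    switch (width_bits) {
    case 32: return 1; /* 01 = 32-bit bus (two 16-bit cycles) */
    case 16: return 2; /* 10 = 16-bit bus */
    case 8:  return 3; /* 11 = 8-bit bus, the reset default */
    default: return -1;
    }
}
```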


Chaining Register (read/write)

    Port Addresses: Channels 0-3 - 0019H
                    Channels 4-7 - 00D9H

    CHAINING ENABLE BIT
        0 = DISABLE CHAINING MODE
        1 = ENABLE CHAINING MODE


3.8 8237A Compatibility


The register arrangement of the 82370 DMA Controller
is a superset of the 8237A DMA Controller.
Functionally, the 82370 DMA Controller is very different
from the 8237A, although it also performs most of
the functions of the 8237A. The following discussion
points out the differences between the 8237A and
the 82370.


The 8237A is limited to transfers between I/O and 
memory only (except in one special case, where two 
channels can be used to perform memory-to-memo- 
ry transfers). The 82370 DMA Controller can transfer 
between any combination of memory and I/O. Sev- 
eral other features of the 8237A are enhanced or 
expanded in the 82370 and other features are add- 
ed. 


The 8237A is an 8-bit only DMA device. For pro- 
gramming compatibility, all of the 8-bit registers are 
preserved in the 82370. The 82370 is programmed 
via 8-bit registers. The address registers in the 
82370 are 24-bit registers in order to support the 
80376's 24-bit bus. The Byte Count Registers are 
24-bit registers, allowing support of larger data 
blocks than possible with the 8237A. 


All of the 8237A's operating modes are supported 
by the 82370 (except the cumbersome two-channel 
memory-to-memory transfer). The 82370 performs 
memory-to-memory transfers using only one chan- 
nel. The 82370 has the added features of buffer 
pipelining (Buffer Chaining Process) and program- 
mable priority levels. 


The 82370 also adds the feature of address regis- 
ters for both destination and source. These address- 
es may be incremented, decremented, or held con- 
stant, as required by the application of the individual 
channel. This allows any combination of destination 
and source device. 


[Chaining Register read format (diagram residue):
CHANNEL 0(4) BASE EMPTY, CHANNEL 1(5) BASE EMPTY,
CHANNEL 2(6) BASE EMPTY, CHANNEL 3(7) BASE EMPTY]


Each DMA channel has associated with it a Target 
and a Requester. In the 8237A, the Target is the 
device which can be accessed by the address regis- 
ter, the Requester is the device which is accessed 
by the DMA Acknowledge signals and must be an 
I/O device. 


4.0 PROGRAMMABLE INTERRUPT CONTROLLER (PIC)


4.1 Functional Description


The 82370 Programmable Interrupt Controller (PIC) 
consists of three enhanced 82C59A Interrupt Con- 
trollers. These three controllers together provide 15 
external and 5 internal interrupt request inputs. Each 
external request input can be cascaded with an ad- 
ditional 82C59A slave controller. This scheme al- 
lows the 82370 to support a maximum of 120 
(15 x 8) external interrupt request inputs. 


Following one or more interrupt requests, the 82370 
PIC issues an interrupt signal to the 80376. When 
the 80376 host processor responds with an interrupt 
acknowledge signal, the PIC will arbitrate between 
the pending interrupt requests and place the inter- 
rupt vector associated with the highest priority pend- 
ing request on the data bus. 


The major enhancement in the 82370 PIC over the 
82C59A is that each of the interrupt request inputs 
can be individually programmed with its own inter- 
rupt vector, allowing more flexibility in interrupt vec- 
tor mapping. 


The block diagram of the 82370 Programmable Interrupt
Controller is shown in Figure 4-1. Internally,
the PIC consists of three 82C59A banks: A, B and C.
The three banks are cascaded to one another: C is
cascaded to B, B is cascaded to A. The INT output
of Bank A is used externally to interrupt the 80376.


Bank A has nine interrupt request inputs (two are
unused), and Banks B and C have eight interrupt
request inputs. Of the fifteen external interrupt
request inputs, two are shared by other functions.
Specifically, the Interrupt Request 3 input (IRQ3#) can
be used as the Timer 2 output (TOUT2#). This pin
can be used in three different ways: IRQ3# input
only, TOUT2# output only, or using TOUT2# to
generate an IRQ3# interrupt request. Also, the
Interrupt Request 9 input (IRQ9#) can be used as the
DMA Request 4 input (DREQ4). Typically, only
IRQ9# or DREQ4 can be used at a time.


All three banks are identical, with the exception of
the IRQ1.5 on Bank A. Therefore, only one bank will
be discussed. In the 82370 PIC, every external request
input can have a slave controller cascaded into it,
and each interrupt controller bank behaves like a
master. As compared to the 82C59A, the enhancements
in the banks are:

- All interrupt vectors are individually programmable.
  (In the 82C59A, the vectors must be programmed in
  eight consecutive interrupt vector locations.)
- The cascade address is provided on the Data Bus
  (D0-D7). (In the 82C59A, three dedicated control
  signals (CAS0, CAS1, CAS2) are used for
  master/slave cascading.)
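

The first enhancement can be illustrated with a short comparison. The sketch below contrasts how an 82C59A in 8086 mode derives a vector (one programmed base whose low three bits are replaced by the request level) with a per-input table as described for the 82370 bank. The type `bank_vectors` and both function names are hypothetical, and the table is only a model of the Vector Registers, not their programming interface.

```c
#include <stdint.h>

/* 82C59A, 8086 mode: the upper five bits come from the programmed base,
 * the lower three bits are the request level, so the eight vectors are
 * always consecutive. */
static uint8_t vector_82c59a(uint8_t base, unsigned level)
{
    return (uint8_t)((base & 0xF8u) | (level & 7u));
}

/* 82370 bank (model): one independently programmed vector per input. */
typedef struct { uint8_t vector[8]; } bank_vectors;

static uint8_t vector_82370(const bank_vectors *bank, unsigned level)
{
    return bank->vector[level & 7u];
}
```

With a base of 20H, level 3 of an 82C59A always yields vector 23H, while the table model can return any value programmed for that input.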
The block diagram of a bank is shown in Figure 4-2.
As can be seen from this figure, the bank consists of 
six major blocks: the Interrupt Request Register 
(IRR), the In-Service Register (ISR), the Interrupt 
Mask Register (IMR), the Priority Resolver (PR), the 
Vector Registers (VR), and the Control Logic. The 
functional description of each block is included be- 
low. 


INTERRUPT REQUEST (IRR) AND IN-SERVICE REGISTER (ISR)


The interrupts at the Interrupt Request (IRQ) input 
lines are handled by two registers in cascade, the 
Interrupt Request Register (IRR) and the In-Service 
Register (ISR). The IRR is used to store all interrupt 
levels which are requesting service; and the ISR is 
used to store all interrupt levels which are being 
serviced. 


PRIORITY RESOLVER (PR)


This logic block determines the priorities of the bits
set in the IRR. The highest priority is selected and
strobed into the corresponding bit of the ISR during
an Interrupt Acknowledge cycle.


INTERRUPT MASK REGISTER (IMR)


The IMR stores the bits which mask (disable) individual
interrupt lines. The IMR operates on the IRR. Masking
of a higher priority input will not affect the
interrupt request lines of lower priority.


This block contains a set of Vector Registers, one 
for each interrupt request line, to store the pre-pro- 
grammed interrupt vector number. The correspond- 
ing vector number will be driven onto the Data Bus 
of the 82370 during the Interrupt Acknowledge cy- 
cle. 


The Control Logic coordinates the overall operations
of the other internal blocks within the same bank.
This logic will drive the Interrupt Output signal (INT)
HIGH when one or more unmasked interrupt inputs
are active (LOW). The INT output signal goes directly
to the 80376 (in bank A) or to another bank to
which this bank is cascaded (see Figure 4-1). Also,
this logic will recognize an Interrupt Acknowledge
cycle (via M/IO#, D/C# and W/R# signals). During
this bus cycle, the Control Logic will enable the
corresponding Vector Register to drive the interrupt
vector onto the Data Bus.


In bank A, the Control Logic is also responsible for 
handling the special ICW2 interrupt request input 
(IRQ1.5). 


There are 15 external Interrupt Request inputs and 5 
internal Interrupt Requests. The external request in- 
puts are: IRQ3#, IRQ9#, IRQ11# to IRQ23#. They 
are shown in bold arrows in Figure 4-1. All IRQ in- 
puts are active LOW and they can be programmed 
(via a control bit in the Initialization Command Word 
1 (ICW1)) to be either edge-triggered or level-trig- 
gered. In order to be recognized as a valid interrupt 
request, the interrupt input must be active (LOW) un- 
til the first INTA cycle (see Bus Functional Descrip- 
tion). Note that all 15 external Interrupt Request in- 
puts have weak internal pull-up resistors. 


As mentioned earlier, an 82C59A can be cascaded 
to each external interrupt input to expand the inter- 
rupt capacity to a maximum of 120 levels. Also, two 
of the interrupt inputs are dual functions: IRQ3# can 
be used as Timer 2 output (TOUT2#) and IRQ9# 
can be used as DREQ4 input. IRQ3# is a bidirec- 
tional dual function pin. This interrupt request input is 
wired-OR with the output of Timer 2 (TOUT2#). If 
only IRQ3# function is to be used, Timer 2 should 
be programmed so that OUT2 is LOW. Note that 
TOUT2# can also be used to generate an interrupt 
request to IRQ3# input. 


The five internal interrupt requests serve special 
system functions. They are shown in Table 4-1. The 
following paragraphs describe these interrupts. 


Interrupt Request    Interrupt Source

IRQ0#                Timer 3 Output (TOUT3)
IRQ8#                Timer 0 Output (TOUT0)
IRQ1#                DMA Chaining Request
IRQ4#                DMA Terminal Count
IRQ1.5#              ICW2 Written


IRQ8# and IRQ0# interrupt requests are initiated
by the output of Timers 0 and 3, respectively. Each
of these requests is generated by an edge-detector
flip-flop.


The flip-flops are activated by the following conditions:
Set   - Rising edge of timer output (TOUT);
Clear - Interrupt acknowledge for this request; OR
        Request is masked (disabled); OR
        Hardware Reset.


These interrupt requests are generated by the
82370 DMA Controller. The chaining request
(IRQ1#) indicates that the DMA Base Register is
not loaded. The Terminal Count request (IRQ4#)
indicates that a software DMA request was cleared.


Whenever an Initialization Command Word 2 (ICW2) is
written to a Bank, a special ICW2 interrupt request is 
generated. The interrupt will be cleared when the 
newly programmed ICW2 Register is read. This in- 
terrupt request is in Bank A at level 1.5. This inter- 
rupt request is internally ORed with the Cascaded 
Request from Bank B and is always assigned a high- 
er priority than the Cascaded Request. 


This special interrupt is provided to support compati- 
bility with the original 82C59A. A detailed description 
of this interrupt is discussed in the Programming 
section. 


During an Interrupt Acknowledge cycle, if there is no 
active pending request, the PIC will automatically 
generate a default vector. This vector corresponds 
to the IRQ7# vector in bank A. 


4.2.2 INTERRUPT OUTPUT (INT)


The INT output pin is taken directly from bank A. 
This signal should be tied to the Maskable Interrupt 
Request (INTR) of the 80376. When this signal is 
active (HIGH), it indicates that one or more internal/ 
external interrupt requests are pending. The 80376 
is expected to respond with an interrupt acknowl- 
edge cycle. 


4.3 Bus Functional Description


The INT output of bank A will be activated as a result 
of any unmasked interrupt request. This may be a 
non-cascaded or cascaded request. After the PIC 
has driven the INT signal HIGH, the 80376 will respond
by performing two interrupt acknowledge cycles.
The timing diagram in Figure 4-3 shows a typical
interrupt acknowledge process between the
82370 and the 80376 CPU.

NOTE:
What is actually driven on the Data Bus depends on whether the current interrupt request is a Slave Request.

                     INTA Cycle 1     INTA Cycle 2

NON-SLAVE REQUEST    00H              Vector
SLAVE REQUEST        Slave Address    High Impedance*

*Slave will place a vector at this time.


After activating the INT signal, the 82370 monitors
the status lines (M/IO#, D/C#, W/R#) and waits
for the 80376 to initiate the first interrupt acknowledge
cycle. In the 80376 environment, two successive
interrupt acknowledge cycles (INTA) marked by
M/IO# = LOW, D/C# = LOW, and W/R# = LOW
are performed. During the first INTA cycle, the PIC
will determine the highest priority request. Assuming
this interrupt input has no external Slave Controller
cascaded to it, the 82370 will drive the Data Bus
with 00H in the first INTA cycle. During the second
INTA cycle, the 82370 PIC will drive the Data Bus
with the corresponding pre-programmed interrupt
vector.


If the PIC determines (from the ICW3) that this inter- 
rupt input has an external Slave Controller cascaded 
to it, it will drive the Data Bus with the specific Slave 
Cascade Address (instead of 00H) during the first
INTA cycle. This Slave Cascade Address is the pre- 
programmed content in the corresponding Vector 
Register. This means that no Slave Address should 
be chosen to be 00H. Note that the Slave Address
and Interrupt Vector are different interpretations of 
the same thing. They are both the contents of the 
programmable Vector Register. During the second 
INTA cycle, the Data Bus will be floated so that the 
external Slave Controller can drive its interrupt vec- 
tor on the bus. Since the Slave Interrupt Controller 
resides on the system bus, bus transceiver enable 
and direction control logic must take this into consid- 
eration. 
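
The two cases described above (non-slave and slave request) can be sketched in C as a small software model. This is only an illustration, not part of the 82370 programming interface: the type `inta_response` and the function `inta_bus_values` are hypothetical names, and the sketch assumes that the ICW3 bit position corresponds to the request level within the bank.

```c
#include <stdint.h>

/* Hypothetical model of what one bank places on the Data Bus
 * during the two INTA cycles. Not an Intel-defined interface. */
typedef struct {
    uint8_t cycle1;         /* byte driven during INTA cycle 1            */
    int     cycle2_driven;  /* 1 if the PIC drives the bus in cycle 2     */
    uint8_t cycle2;         /* byte driven in cycle 2 (if cycle2_driven)  */
} inta_response;

/* icw3:       one bit per request input, 1 = Slave cascaded (assumed layout)
 * level:      request level (0-7) selected by the Priority Resolver
 * vector_reg: contents of the Vector Register for that level */
static inta_response inta_bus_values(uint8_t icw3, int level, uint8_t vector_reg)
{
    inta_response r;
    if (icw3 & (1u << level)) {
        /* Slave request: Vector Register is used as the Slave Cascade
         * Address in cycle 1; the bus is floated in cycle 2. */
        r.cycle1 = vector_reg;
        r.cycle2_driven = 0;
        r.cycle2 = 0;
    } else {
        /* Non-slave request: 00H in cycle 1, vector in cycle 2. */
        r.cycle1 = 0x00;
        r.cycle2_driven = 1;
        r.cycle2 = vector_reg;
    }
    return r;
}
```

Because 00H in cycle 1 identifies a non-slave request, a Vector Register used as a Slave Cascade Address must not hold 00H, which is the restriction stated in the text.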


In order to have a successful interrupt service, the 
interrupt request input must be held valid (LOW) until 
the beginning of the first interrupt acknowledge cy- 
cle. If there is no pending interrupt request when the 
first INTA cycle is generated, the PIC will generate a 
default vector, which is the IRQ7 vector (Bank A, 
level 7). 


According to the Bus Cycle definition of the 80376, 
there will be four Bus Idle States between the two 
interrupt acknowledge cycles. These idle bus cycles 
will be initiated by the 80376. Also, during each inter- 
rupt acknowledge cycle, the internal Wait State Gen- 
erator of the 82370 will automatically generate the 
required number of wait states for internal delays. 


A variety of modes and commands are available for 
controlling the 82370 PIC. All of them are program- 
mable; that is, they may be changed dynamically un- 
der software control. In fact, each bank can be pro- 
grammed individually to operate in different modes. 
With these modes and commands, many possible
configurations are conceivable, giving the user
enough versatility for almost any interrupt controlled
application.


This section is not intended to show how the 82370 
PIC can be programmed. Rather, it describes the 
operation in different modes. 


Upon completion of an interrupt service routine, the 
interrupted bank needs to be notified so its ISR can 
be updated. This allows the PIC to keep track of 
which interrupt levels are in the process of being 
serviced and their relative priorities. Three different 
End-Of-Interrupt (EOI) formats are available. They 
are: Non-Specific EOI Command, Specific EOI Com- 
mand, and Automatic EOI Mode. Selection of which 
EOI to use is dependent upon the interrupt opera- 
tions the user wishes to perform. 


If the 82370 is NOT programmed in the Automatic 
EOI Mode, an EOI command must be issued by the 
80376 to the specific 82370 PIC Controller Bank. 
Also, if this controller bank is cascaded to another 
internal bank, an EOI command must also be sent to 
the bank to which this bank is cascaded. For exam- 
ple, if an interrupt request of Bank C in the 82370 
PIC is serviced, an EOI should be written into Bank 
C, Bank B and Bank A. If the request comes from an 
external interrupt controller cascaded to Bank C, 
then an EOI should be written into the external con- 
troller as well. 
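
As a rough illustration of the rule above, the following C sketch lists the command ports that would receive an EOI for a request serviced in a given bank. The port addresses are those given for ICW1/OCW2/OCW3 writes in the register address map (Table 4-3); the function name `eoi_port_sequence` and the single-character bank identifier are hypothetical. The EOI command byte itself and the actual I/O write are left to the caller, as is any EOI to an external Slave Controller.

```c
#include <stddef.h>
#include <stdint.h>

/* Command port (ICW1/OCW2/OCW3 write address) of each bank, per Table 4-3. */
enum {
    BANK_A_CMD_PORT = 0x30,
    BANK_B_CMD_PORT = 0x20,
    BANK_C_CMD_PORT = 0xA0
};

/* Fill ports[] with the command ports that need an EOI, in order,
 * when a request from the given bank ('A', 'B' or 'C') was serviced.
 * Returns the number of entries written (0 for an unknown bank). */
static size_t eoi_port_sequence(char bank, uint8_t ports[3])
{
    size_t n = 0;
    switch (bank) {
    case 'C':
        ports[n++] = BANK_C_CMD_PORT;
        /* fall through: Bank C is cascaded into Bank B */
    case 'B':
        ports[n++] = BANK_B_CMD_PORT;
        /* fall through: Bank B is cascaded into Bank A */
    case 'A':
        ports[n++] = BANK_A_CMD_PORT;
        break;
    default:
        break;
    }
    return n;
}
```

A caller would loop over the returned ports and write the chosen EOI command (an OCW2 value) to each one.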


A Non-Specific EOI command sent from the 80376 
lets the 82370 PIC bank know when a service rou- 
tine has been completed, without specification of its 
exact interrupt level. The respective interrupt bank 
automatically determines the interrupt level and
resets the correct bit in the ISR.


To take advantage of the Non-Specific EOI, the in- 
terrupt bank must be in a mode of operation in which 
it can predetermine its in-service routine levels. For 
this reason, the Non-Specific EOI command should 
only be used when the most recent level acknowledged
and serviced is always the highest priority
level (i.e., in the Fully Nested Mode structure to be
described below). When the interrupt bank receives a
Non-Specific EOI command, it simply resets the 
highest priority ISR bit to indicate that the highest 
priority routine in service is finished. 


Special consideration should be taken when decid- 
ing to use the Non-Specific EOI command. Here are 
two operating conditions in which it is best NOT 
used since the Fully Nested Mode structure will be 
destroyed: 
- 
Using the Set Priority command within an inter- 
rupt service routine. 


- 
Using a Special Mask Mode. 


These conditions are covered in more detail in their 
own sections, but are listed here for reference. 


Unlike a Non-Specific EOI command which automat- 
ically resets the highest priority ISR bit, a Specific 
EOI command specifies an exact ISR bit to be reset. 
Any one of the IRQ levels of an interrupt bank can
be specified in the command. 


The Specific EOI command is needed to reset the 
ISR bit of a completed service routine whenever the 
interrupt bank is not able to automatically determine 
it. The Specific EOI command can be used in all 
conditions of operation, including those that prohibit
Non-Specific EOI command usage mentioned above.


When programmed in the Automatic EOI Mode, the 
80376 no longer needs to issue a command to notify 
the interrupt bank it has completed an interrupt rou- 
tine. The interrupt bank accomplishes this by per- 
forming a Non-Specific EOI automatically at the end 
of the second INTA cycle. 


Special consideration should be taken when decid- 
ing to use the Automatic EOI Mode because it may 
disturb the Fully Nested Mode structure. In the Auto- 
matic EOI Mode, the ISR bit of a routine in service is 
reset right after it is acknowledged, thus leaving no 
designation in the ISR that a service routine is being 
executed. If any interrupt request within the same 
bank occurs during this time and interrupts are en- 
abled, it will get serviced regardless of its priority. 
Therefore, when using this mode, the 80376 should 
keep its interrupt request input disabled during exe- 
cution of a service routine. By doing this, higher pri- 
ority interrupt levels will be serviced only after the 
completion of a routine in service. This guideline re- 
stores the Fully Nested Mode structure. However, in 
this scheme, a routine in service cannot be interrupt- 
ed since the host's interrupt request input is dis- 
abled. 


The 82370 PIC provides various methods for arrang- 
ing the interrupt priorities of the interrupt request in- 
puts to suit different applications. The following sub- 
sections explain these methods in detail. 


4.4.2.1 Fully Nested Mode


The Fully Nested Mode of operation is a general pur- 
pose priority mode. This mode supports a multi-level 
interrupt structure in which all of the Interrupt
Request (IRQ) inputs within one bank are arranged
from highest to lowest. 

Unless otherwise programmed, the Fully Nested
Mode is entered by default upon initialization. At this
time, IRQ0# is assigned the highest priority
(priority = 0) and IRQ7# the lowest (priority = 7).
This default priority can be changed, as will be
explained later in the Rotating Priority Mode.


When an interrupt is acknowledged, the highest pri- 
ority request is determined from the Interrupt Re- 
quest Register (IRR) and its vector is placed on the 
bus. In addition, the corresponding bit in the In-Serv- 
ice Register (ISR) is set to designate the routine in 
service. This ISR bit will remain set until the 80376 
issues an End Of Interrupt (EOI) command immedi- 
ately before returning from the service routine; or 
alternately, if the Automatic End Of Interrupt (AEOI) 
bit is set, the ISR bit will be reset at the end of the 
second INTA cycle. 


While the ISR bit is set, all further interrupts of the 
same or lower priority are inhibited. Higher level in- 
terrupts can still generate an interrupt, which will be 
acknowledged only if the 80376 internal interrupt en- 
able flip-flop has been reenabled (through software 
inside the current service routine). 
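
The Fully Nested behaviour described above can be modelled in a few lines of C. The sketch below is a software model for illustration only, with the default priorities (level 0 highest, level 7 lowest); the names `bank_model`, `bank_next_level` and `bank_acknowledge` are hypothetical. Clearing the IRR bit on acknowledge is an assumption (edge-triggered behaviour as in the 82C59A) that is not stated in this section.

```c
#include <stdint.h>

/* Hypothetical software model of one interrupt bank. */
typedef struct {
    uint8_t irr;  /* Interrupt Request Register: levels requesting service */
    uint8_t isr;  /* In-Service Register: levels being serviced            */
    uint8_t imr;  /* Interrupt Mask Register: 1 = level masked             */
} bank_model;

/* Return the level (0-7) that would be acknowledged next, or -1 if none.
 * Default Fully Nested priorities: level 0 highest, level 7 lowest. */
static int bank_next_level(const bank_model *b)
{
    uint8_t pending = b->irr & (uint8_t)~b->imr;  /* IMR acts on the IRR output */
    for (int level = 0; level < 8; level++) {
        uint8_t bit = (uint8_t)(1u << level);
        if (b->isr & bit)
            return -1;      /* same or lower priority levels are inhibited */
        if (pending & bit)
            return level;   /* highest priority unmasked request */
    }
    return -1;
}

/* Model an interrupt acknowledge: strobe the winning level into the ISR. */
static int bank_acknowledge(bank_model *b)
{
    int level = bank_next_level(b);
    if (level >= 0) {
        b->isr |= (uint8_t)(1u << level);
        b->irr &= (uint8_t)~(1u << level);  /* assumption: request is consumed */
    }
    return level;
}
```

With IRQ4 and IRQ6 both pending, the model acknowledges IRQ4 first; IRQ6 is then held off until the ISR bit of IRQ4 is reset, while a later IRQ1 request would still be eligible.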


4.4.2.2 Automatic Rotation-Equal Priority Devices


Automatic rotation of priorities serves in applications 
where the interrupting devices are of equal priority
within an interrupt bank. In this kind of environment,
once a device is serviced, all other equal priority pe- 
ripherals should be given a chance to be serviced 
before the original device is serviced again. This is 
accomplished by automatically assigning a device 
the lowest priority after being serviced. Thus, in the 
worst case, the device would have to wait until all 
other peripherals connected to the same bank are 
serviced before it is serviced again. 


There are two methods of accomplishing automatic 
rotation. One is used in conjunction with the Non- 
Specific EOI command and the other is used with 
the Automatic EOI mode. These two methods are 
discussed below. 


When the Rotate On Non-Specific EOI command is 
issued, the highest ISR bit is reset as in a normal 
Non-Specific EOI command. However, after it is re- 
set, the corresponding Interrupt Request (IRQ) level 
is assigned the lowest priority. Other IRQ priorities 
rotate to conform to the Fully Nested Mode based 
on the newly assigned low priority. 


Figure 4-4 shows how the Rotate On Non-Specific
EOI command affects the interrupt priorities. Assume
the IRQ priorities were assigned with IRQ0 the
highest and IRQ7 the lowest. IRQ6 and IRQ4 are
already in service but neither is completed. Being
the higher priority routine, IRQ4 is necessarily the
routine being executed. During the IRQ4 routine, a
Rotate On Non-Specific EOI command is executed.
When this happens, Bit 4 in the ISR is reset. IRQ4
then becomes the lowest priority and IRQ5 becomes
the highest.
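
The IRQ4/IRQ6 example can be reproduced with a small C model. This is a sketch under stated assumptions, not a description of the internal logic: the rotating priority is represented by a single field holding the level that currently has the lowest priority, and the names `rot_bank`, `highest_in_service`, `rotate_on_nonspecific_eoi` and `highest_priority_level` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model: ISR plus the level that currently has lowest priority.
 * After initialization lowest = 7, so IRQ0 is highest and IRQ7 lowest. */
typedef struct {
    uint8_t isr;     /* In-Service Register                    */
    uint8_t lowest;  /* level (0-7) holding the lowest priority */
} rot_bank;

/* The highest priority level is the one just after the lowest, modulo 8. */
static int highest_priority_level(const rot_bank *b)
{
    return (b->lowest + 1) % 8;
}

/* Return the highest priority level whose ISR bit is set, or -1 if none. */
static int highest_in_service(const rot_bank *b)
{
    for (int i = 1; i <= 8; i++) {
        int level = (b->lowest + i) % 8;  /* scan from highest to lowest */
        if (b->isr & (1u << level))
            return level;
    }
    return -1;
}

/* Rotate On Non-Specific EOI: reset the highest priority ISR bit and
 * assign that level the lowest priority. Returns the level reset, or -1. */
static int rotate_on_nonspecific_eoi(rot_bank *b)
{
    int level = highest_in_service(b);
    if (level >= 0) {
        b->isr &= (uint8_t)~(1u << level);
        b->lowest = (uint8_t)level;
    }
    return level;
}
```

Starting from ISR = 50H (IRQ4 and IRQ6 in service) with default priorities, the command resets bit 4, IRQ4 becomes the lowest priority, and IRQ5 becomes the highest, matching the example in the text.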


The Rotate On Automatic EOI Mode works much
like the Rotate On Non-Specific EOI Command. The
main difference is that priority rotation is done
automatically after the second INTA cycle of an
interrupt request. To enter or exit this mode, a
Rotate-On-Automatic-EOI Set Command and a
Rotate-On-Automatic-EOI Clear Command are provided.
After this mode is entered, no other commands are
needed as in the normal Automatic EOI Mode.
However, it must be noted again that when using any
form of the Automatic EOI Mode, special consideration
should be taken. The guideline presented in the
Automatic EOI Mode also applies here.


4.4.2.3 Specific Rotation-Specific Priority


Specific rotation gives the user versatile capabilities
in interrupt controlled operations. It serves in those
applications in which a specific device's interrupt
priority must be altered. As opposed to Automatic
Rotation which will automatically set priorities after
each interrupt request is serviced, specific rotation is
completely user controlled. That is, the user selects
which interrupt level is to receive the lowest or the
highest priority. This can be done during the main
program or within interrupt routines. Two specific
rotation commands are available to the user: Set
Priority Command and Rotate On Specific EOI
Command.


The Set Priority Command allows the programmer to
assign an IRQ level the lowest priority. All other
interrupt levels will conform to the Fully Nested Mode
based on the newly assigned low priority.


The Rotate On Specific EOI Command is literally a
combination of the Set Priority Command and the
Specific EOI Command. Like the Set Priority
Command, a specified IRQ level is assigned lowest
priority. Like the Specific EOI Command, a specified
level will be reset in the ISR. Thus, this command
accomplishes both tasks in one single command.


4.4.2.4 Interrupt Priority Mode Summary


In order to simplify understanding the many modes
of interrupt priority, Table 4-2 is provided to bring out
their summary of operations.


Interrupt                 Operation                   Effect On Priority After EOI
Priority Mode             Summary                     Non-Specific/Automatic      Specific

Fully-Nested Mode         IRQ0# - Highest Priority    No change in priority.      Not Applicable.
                          IRQ7# - Lowest Priority     Highest ISR bit is reset.

Automatic Rotation        Interrupt level just        Highest ISR bit is reset    Not Applicable.
(Equal Priority Devices)  serviced is the lowest      and the corresponding
                          priority. Other priorities  level becomes the lowest
                          rotate to conform to        priority.
                          Fully-Nested Mode.

Specific Rotation         User specifies the          Not Applicable.             As described under
(Specific Priority        lowest priority level.                                  "Operation Summary".
Devices)                  Other priorities rotate
                          to conform to
                          Fully-Nested Mode.


4.4.3 INTERRUPT MASKING


VIA INTERRUPT MASK REGISTER


Each bank in the 82370 PIC has an Interrupt Mask
Register (IMR) which enhances interrupt control
capabilities. This IMR allows individual IRQ masking.
When an IRQ is masked, its interrupt request is
disabled until it is unmasked. Each bit in the 8-bit IMR
disables one interrupt channel if it is set (HIGH). Bit
0 masks IRQ0, Bit 1 masks IRQ1 and so forth.
Masking an IRQ channel will only disable the
corresponding channel and does not affect the others'
operations.


The IMR acts only on the output of the IRR. That is, 
if an interrupt occurs while its IMR bit is set, this 
request is not "forgotten". 
Even with an IRQ input 
masked, it is still possible to set the IRR. Therefore, 
when the IMR bit is reset, an interrupt request to the 
80376 will then be generated, providing that the IRQ 
request remains active. If the IRQ request is re- 
moved before the IMR is reset, the Default Interrupt 
Vector (Bank A, level 7) will be generated during the 
interrupt acknowledge cycle. 


In the Fully Nested Mode, all IRQ levels of lower 
priority than the routine in service are inhibited. How- 
ever, in some applications, it may be desirable to let 
a lower priority interrupt request interrupt the routine
in service. One method to achieve this is by
using the Special Mask Mode. Working in conjunc- 
tion with the IMR, the Special Mask Mode enables 
interrupts from all levels except the level in service. 
This is usually done inside an interrupt service rou- 
tine by masking the level that is in service and then 
issuing the Special Mask Mode Command. Once the 
Special Mask Mode is enabled, it remains in effect 
until it is disabled. 
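
The difference between the Fully Nested Mode and the Special Mask Mode can be expressed as a bit-mask computation. The C sketch below is an illustrative model only, assuming default priorities (level 0 highest); the function name `eligible_levels` is hypothetical.

```c
#include <stdint.h>

/* Return a bit mask of the levels that are allowed to interrupt.
 * irr, isr, imr:     register contents of one bank
 * special_mask_mode: 0 = Fully Nested Mode, non-zero = Special Mask Mode
 * Default priorities are assumed (level 0 highest, level 7 lowest). */
static uint8_t eligible_levels(uint8_t irr, uint8_t isr, uint8_t imr,
                               int special_mask_mode)
{
    uint8_t unmasked = irr & (uint8_t)~imr;  /* IMR acts on the IRR output */

    if (special_mask_mode)
        return unmasked;  /* every level not masked in the IMR may interrupt */

    /* Fully Nested Mode: only levels above the highest level in service. */
    uint8_t higher = 0;
    for (int level = 0; level < 8 && !(isr & (1u << level)); level++)
        higher |= (uint8_t)(1u << level);
    return unmasked & higher;
}
```

For example, with IRQ4 in service and masked in the IMR, a pending IRQ6 is held off in the Fully Nested Mode but becomes eligible once the Special Mask Mode is set.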


4.4.4 EDGE OR LEVEL INTERRUPT TRIGGERING


Each bank in the 82370 PIC can be programmed 
independently for either edge or level sensing for the
interrupt request signals. Recall that all IRQ inputs
are active LOW. Therefore, in the edge triggered 
mode, an active edge is defined as an input tran- 
sition from an inactive (HIGH) to active (LOW) state. 
The interrupt input may remain active without gener- 
ating another interrupt. During level triggered mode, 
an interrupt request will be recognized by an active 
(LOW) input, and there is no need for edge detec- 
tion. However, the interrupt request must be re- 
moved before the EOI Command is issued, or the 
80376 must be disabled to prevent a second false 
interrupt from occurring. 


In either mode, the interrupt request input must be
active (LOW) during the first INTA cycle in order to 
be recognized. Otherwise, the Default Interrupt Vec- 
tor will be generated at level 7 of Bank A. 


As mentioned previously, the 82370 allows for exter- 
nal Slave interrupt controllers to be cascaded to any 
of its external interrupt request pins. The 82370 PIC 
indicates that an external Slave Controller is to be 
serviced by putting the contents of the Vector Regis- 
ter associated with the particular request on the 
80376 Data Bus during the first INTA cycle (instead 
of 00H during a non-slave service). The external
logic should latch the vector on the Data Bus using the
INTA status signals and use it to select the external 
Slave Controller to be serviced (see Figure 4-5). The 
selected Slave will then respond to the second INTA 
cycle and place its vector on the Data Bus. This 
method requires that if external Slave Controllers 
are used in the system, no vector should be
programmed to 00H.


Since the external Slave Cascade Address is provid- 
ed on the Data Bus during INTA cycle 1, an external 
latch is required to capture this address for the Slave 
Controller. A simple scheme is depicted in Figure 
4-5 below. 

[Figure 4-5: Slave Cascade Address CAS(0-7), latched from the Data Bus during INTA cycle 1, to the Slave 8259s]


4.4.5.1 Special Fully Nested Mode


This mode will be used where cascading is em- 
ployed and the priority is to be conserved within 
each Slave Controller. The Special Fully Nested 
Mode is similar to the "regular" Fully Nested Mode 
with the following exceptions: 
- 
When an interrupt request from a Slave Control- 
ler is in service, this Slave Controller is not 
locked out from the Master's priority logic. Fur- 
ther interrupt requests from the higher priority 
logic within the Slave Controller will be recog- 
nized by the 82370 PIC and will initiate interrupts 
to the 80376. By comparison, in the "regular" Fully
Nested Mode, the Slave Controller is masked out 
when its request is in service and no higher re- 
quests from the same Slave Controller can be 
serviced. 


- 
Before exiting the interrupt service routine, the 
software has to check whether the interrupt serv- 
iced was the only request from the Slave Con- 
troller. This is done by sending a Non-Specific 
EOI Command to the Slave Controller and then 
reading its In Service Register. If there are no 
requests in the Slave Controller, a Non-Specific 
EOI can be sent to the corresponding 82370 PIC 
bank also. Otherwise, no EOI should be sent. 
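
The exit sequence in the second item above can be sketched as follows. This C fragment is a software model for illustration, not driver code: both In-Service Registers are represented as plain variables, the Slave Controller is assumed to use default priorities (level 0 highest), and the function name `sfnm_end_of_interrupt` is hypothetical.

```c
#include <stdint.h>

/* Model of the end-of-interrupt procedure in Special Fully Nested Mode.
 * slave_isr:    In-Service Register of the external Slave Controller
 * master_isr:   In-Service Register of the 82370 PIC bank (the Master)
 * master_level: level of the Master input the Slave is cascaded to
 * Returns 1 if an EOI was also applied to the Master bank, 0 otherwise. */
static int sfnm_end_of_interrupt(uint8_t *slave_isr, uint8_t *master_isr,
                                 int master_level)
{
    /* Non-Specific EOI to the Slave: reset its highest priority ISR bit
     * (the lowest numbered set bit under default priorities). */
    *slave_isr &= (uint8_t)(*slave_isr - 1u);

    /* Read back the Slave ISR; only if nothing else is in service
     * is the EOI passed on to the Master bank. */
    if (*slave_isr == 0) {
        *master_isr &= (uint8_t)~(1u << master_level);
        return 1;
    }
    return 0;
}
```

If the Slave still has a routine in service, the Master's ISR bit is left set so that the nesting between the two Slave routines is preserved.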


The 82370 PIC provides several ways to read differ- 
ent status of each interrupt bank for more flexible 
interrupt control operations. These include polling 
the highest priority pending interrupt request and 
reading the contents of different interrupt status reg- 
isters. 


4.4.6.1 Poll Command


The 82370 PIC supports status polling operations 
with the Poll Command. In a Poll Command, the 
pending interrupt request with the highest priority 
can be determined. To use this command, the INT 
output is not used, or the 80376 interrupt is disabled. 
Service to devices is achieved by software using the 
Poll Command. 


This mode is useful if there is a routine command 
common to several levels so that the INTA se- 
quence is not needed. Another application is to use 
the Poll Command to expand the number of priority 
levels. 


Notice that the ICW2 mechanism is not supported 
for the Poll Command. However, if the Poll Com- 
mand is used, the programmable Vector Registers 
are of no concern since no INTA cycle will be gener- 
ated. 


4.4.6.2 Reading Interrupt Registers


The contents of each interrupt register (IRR, ISR,
and IMR) can be read to update the user's program 
on the present status of the 82370 PIC. This can be 
a versatile tool in the decision making process of a 
service routine, giving the user more control over 
interrupt operations. 


The reading of the IRR and ISR contents can be 
performed via the Operation Control Word 3 by us- 
ing a Read Status Register Command and the con- 
tent of IMR can be read via a simple read operation 
of the register itself. 


Each bank of the 82370 PIC consists of a set of 8-bit
registers to control its operations. The address map 
of all the registers is shown in Table 4-3 below. 
Since all three register sets are identical in functions, 
only one set will be described. 


Functionally, each register set can be divided into 
five groups. They are: the four Initialization Com- 
mand Words (ICW's), the three Operation Control 
Words (OCW's), the Poll/Interrupt Request/In-Serv- 
ice Register, the Interrupt Mask Register, and the 
Vector Registers. A description of each group fol- 
lows. 

Port
Address   Access       Register Description

20H       Write        Bank B ICW1, OCW2, or OCW3
          Read         Bank B Poll, Request or In-Service Status Register
21H       Write        Bank B ICW2, ICW3, ICW4, OCW1
          Read         Bank B Mask Register
22H       Read         Bank B ICW2
28H       Read/Write   IRQ8 Vector Register
29H       Read/Write   IRQ9 Vector Register
2AH       Read/Write   Reserved
2BH       Read/Write   IRQ11 Vector Register
2CH       Read/Write   IRQ12 Vector Register
2DH       Read/Write   IRQ13 Vector Register
2EH       Read/Write   IRQ14 Vector Register
2FH       Read/Write   IRQ15 Vector Register

A0H       Write        Bank C ICW1, OCW2, or OCW3
          Read         Bank C Poll, Request or In-Service Status Register
A1H       Write        Bank C ICW2, ICW3, ICW4, OCW1
          Read         Bank C Mask Register
A2H       Read         Bank C ICW2
A8H       Read/Write   IRQ16 Vector Register
A9H       Read/Write   IRQ17 Vector Register
AAH       Read/Write   IRQ18 Vector Register
ABH       Read/Write   IRQ19 Vector Register
ACH       Read/Write   IRQ20 Vector Register
ADH       Read/Write   IRQ21 Vector Register
AEH       Read/Write   IRQ22 Vector Register
AFH       Read/Write   IRQ23 Vector Register

30H       Write        Bank A ICW1, OCW2, or OCW3
          Read         Bank A Poll, Request or In-Service Status Register
31H       Write        Bank A ICW2, ICW3, ICW4, OCW1
          Read         Bank A Mask Register
32H       Read         Bank A ICW2
38H       Read/Write   IRQ0 Vector Register
39H       Read/Write   IRQ1 Vector Register
3AH       Read/Write   IRQ1.5 Vector Register
3BH       Read/Write   IRQ3 Vector Register
3CH       Read/Write   IRQ4 Vector Register
3DH       Read/Write   Reserved
3EH       Read/Write   Reserved
3FH       Read/Write   IRQ7 Vector Register
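
The Vector Register addresses in the table follow a regular pattern within each bank, which the C sketch below captures. The helper name `vector_register_port` and the constant `VECTOR_PORT_IRQ1_5` are hypothetical; the function simply encodes the table and returns -1 for the entries listed as Reserved and for request numbers outside 0 to 23.

```c
/* Port of the special Bank A level 1.5 (ICW2 written) Vector Register. */
#define VECTOR_PORT_IRQ1_5 0x3A

/* Return the I/O port of the Vector Register for request irq (0-23),
 * or -1 if the request number is out of range or has no Vector Register.
 * Level 2 of Bank A is occupied by IRQ1.5 and is not returned here;
 * IRQ5, IRQ6 and IRQ10 are listed as Reserved in the address map. */
static int vector_register_port(int irq)
{
    if (irq < 0 || irq > 23)
        return -1;
    if (irq == 2 || irq == 5 || irq == 6 || irq == 10)
        return -1;
    if (irq < 8)
        return 0x38 + irq;         /* Bank A: 38H-3FH */
    if (irq < 16)
        return 0x28 + (irq - 8);   /* Bank B: 28H-2FH */
    return 0xA8 + (irq - 16);      /* Bank C: A8H-AFH */
}
```

Initialization code could use such a helper to program each Vector Register in a loop rather than hard-coding every address.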

Before normal operation can begin, the 82370 PIC
must be brought to a known state. There are four
8-bit Initialization Command Words in each interrupt
bank to set up the necessary conditions and modes
for proper operation. Except for the second command
word (ICW2), which is a read/write register, the
other three are write-only registers. Without going 
into detail of the bit definitions of the command 
words, the following subsections give a brief de- 
scription of what functions each command word 
controls. 


The ICW1 has three major functions. They are: 
- 
To select between the two IRQ input triggering 
modes (edge- or level-triggered); 
- 
To designate whether or not the interrupt bank is 
to be used alone or in the cascade mode. If the 
cascade mode is desired, the interrupt bank will 
accept ICW3 for further cascade mode program- 
ming. Otherwise, no ICW3 will be accepted; 


- 
To determine whether or not ICW4 will be issued; 
that is, if any of the ICW4 operations are to be 
used. 


ICW2 is provided for compatibility with the 82C59A 
only. Its contents do not affect the operation of the 
interrupt bank in any way. Whenever the ICW2 of 
any of the three banks is written into, an interrupt is 
generated from bank A at level 1.5. The interrupt 
request will be cleared after the ICW2 register has 
been read by the 80376. The user is expected to 
program the corresponding vector register or to use 
it as an indicator that an attempt was made to alter 
the contents. Note that each ICW2 register has dif- 
ferent addresses for read and write operations. 


The interrupt bank will only accept an ICW3 if
programmed in the external cascade mode (as indicated
in ICW1). ICW3 is used for specific programming
within the cascade mode. The bits in ICW3 indicate 
which interrupt request inputs have a Slave cascad- 
ed to them. This will subsequently affect the inter- 
rupt vector generation during the interrupt acknowl- 
edge cycles as described previously. 


The ICW4 is accepted only if it was selected in 
ICW1. This command word register serves two func- 
tions: 
- 
To select either the Automatic EOI mode or
software EOI mode;
- 
To select if the Special Nested mode is to be 
used in conjunction with the cascade mode. 


4.5.2 OPERATION CONTROL WORDS (OCW)


Once initialized by the ICW's, the interrupt banks will 
be operating in the Fully Nested Mode by default 
and they are ready to accept interrupt requests. 
However, the operations of each interrupt bank can 
be further controlled or modified by the use of
OCW's. Three OCW's are available for programming
various modes and commands. Note that all OCW's
are 8-bit write-only registers.


The modes and operations controlled by the OCW's
are:
- Fully Nested Mode;
- Rotating Priority Mode;
- Special Mask Mode;
- Poll Mode;
- EOI Commands;
- Read Status Commands.


OCW1 is used solely for masking operations. It
provides a direct link to the Interrupt Mask Register
(IMR). The 80376 can write to this OCW register to 
enable or disable the interrupt inputs. Reading the 
pre-programmed mask can be done via the Interrupt 
Mask Register which will be discussed shortly. 


OCW2 is used to select End-of-Interrupt, Automatic 
Priority Rotation, and Specific Priority Rotation
operations. Associated commands and modes of these
operations are selected using the different combina- 
tions of bits in OCW2. 


Specifically, the OCW2 is used to:
- Designate an interrupt level (0-7) to be used to
  reset a specific ISR bit or to set a specific priority.
  This function can be enabled or disabled;
- Select which software EOI command (if any) is to
  be executed (i.e., Non-Specific or Specific EOI);
- Enable one of the priority rotation operations (i.e.,
  Rotate On Non-Specific EOI, Rotate On Automatic
  EOI, or Rotate On Specific EOI).


There are three main categories of operation that
OCW3 controls. They are summarized as follows:
- To select and execute the Read Status Register
  Commands, either reading the Interrupt Request
  Register (IRR) or the In-Service Register (ISR);
- To issue the Poll Command. The Poll Command
  will override a Read Register Command if both
  functions are enabled simultaneously;
- To set or reset the Special Mask Mode.


4.5.3 POLL/INTERRUPT REQUEST/IN-SERVICE STATUS REGISTER


As the name implies, this 8-bit read-only register has 
multiple functions. Depending on the command is- 
sued in the OCW3, the content of this register re- 
flects the result of the command executed. For a 
Poll Command, the register read contains the binary 
code of the highest priority level requesting service 
(if any). For a Read IRR Command, the register con- 
tent will show the current pending interrupt re- 
quest(s). Finally, for a Read ISR Command, this reg- 
ister will specify all interrupt levels which are being 
serviced. 
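The three uses of this shared register can be modeled with a small Python sketch. It is only an illustration: the command names are hypothetical, the Poll result is returned as a bare level number rather than the actual register encoding, and a fixed priority order with level 0 highest is assumed.

```python
def highest_priority_request(irr: int):
    """Return the lowest-numbered pending level (assumed highest priority), or None."""
    for level in range(8):
        if irr & (1 << level):
            return level
    return None


def read_status(command: str, irr: int, isr: int):
    """Model what a read of the shared status register reflects for each command."""
    if command == "poll":
        return highest_priority_request(irr)  # level requesting service, if any
    if command == "read_irr":
        return irr & 0xFF                     # pending interrupt requests
    if command == "read_isr":
        return isr & 0xFF                     # levels currently being serviced
    raise ValueError(f"unknown command: {command}")
```

For example, with requests pending on levels 3 and 5, a poll in this model reports level 3, while a Read IRR reports both bits.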


The Interrupt Mask Register (IMR) is a read-only 8-bit
register which, when read, specifies all interrupt levels
within the same bank that are masked.


Each interrupt request input has an 8-bit read/write
programmable vector register associated with it. The 
registers should be programmed to contain the inter- 
rupt vector for the corresponding request. The con- 
tents of the Vector Register will be placed on the 
Data Bus during the INTA cycles as described previ- 
ously. 


4.6 Programming


Programming the 82370 PIC is accomplished by using
two types of command words: ICW's and
OCW's. All modes and commands explained in the
previous sections are programmable using the
ICW's and OCW's. The ICW's are issued from the
80376 in a sequential format and are used to set up
the banks in the 82370 PIC in an initial state of
operation. The OCW's are issued as needed to vary and
control the 82370 PIC's operations.


Both ICW's and OCW's are sent by the 80376 to the 
interrupt banks via the Data Bus. Each bank distin- 
guishes between the different ICW's and OCW's by 
the I/O address map, the sequence they are issued 
(ICW's only), and by some dedicated bits among the 
ICW's and OCW's. 


An example of programming the 82370 interrupt
controllers is given in Appendix C (Programming the
82370 Interrupt Controllers).


All three interrupt banks are programmed in a similar 
way. Therefore, only a single bank will be described 
in the following sections. 


Before normal operation can begin, each bank must 
be initialized by programming a sequence of two to 
four bytes written into the ICW's. 


Figure 4-6 shows the initialization flow for an inter- 
rupt bank. Both ICW1 and ICW2 must be issued for 
any form of operation. However, ICW3 and ICW4 are 
used only if designated in ICW1. Once initialized, if 
any programming changes within the ICW's are to 
be made, the entire ICW sequence must be repro- 
grammed, not just an individual ICW. 
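The rule above (ICW1 and ICW2 always, ICW3 and ICW4 only when designated in ICW1) can be sketched as a small Python helper. The function and argument names are hypothetical; the helper returns only the order of the words, not their bit contents.

```python
def icw_sequence(needs_icw3: bool, needs_icw4: bool) -> list:
    """Return the ordered list of ICWs that must be written to one bank."""
    sequence = ["ICW1", "ICW2"]      # always required
    if needs_icw3:
        sequence.append("ICW3")      # only when designated in ICW1
    if needs_icw4:
        sequence.append("ICW4")      # only when designated in ICW1
    return sequence
```

The result always holds two to four entries, and any later change to one ICW means replaying the whole list.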


Note that although the ICW2's in the 82370 PIC do
not affect the Bank's operation, they still must be
programmed in order to preserve compatibility
with the 82C59A. The contents programmed are not
relevant to the overall operations of the interrupt
banks. Also, whenever one of the three ICW2's is
programmed, an interrupt request on level 1.5 in Bank A
will be generated. This interrupt request will be cleared
upon reading of the ICW2 registers. Since the three
ICW2's share the same interrupt level and the system
may not know the origin of the interrupt, all three
ICW2's must be read.


[Notes from Figure 4-6: The ICW2 vector address must be
programmed now. Other vector addresses may be programmed
via the ICW2 interrupt service routine.]


Certain internal setup conditions occur automatically
within the interrupt bank after the first ICW (ICW1)
has been issued. These are:
- The edge sensitive circuit is reset, which means
  that following initialization, an interrupt request
  input must make a HIGH-to-LOW transition to
  generate an interrupt;
- The Interrupt Mask Register (IMR) is cleared;
  that is, all interrupt inputs are enabled;
- IRQ7 input of each bank is assigned priority 7
  (lowest);
- Special Mask Mode is cleared and Status Read
  is set to IRR;
- If no ICW4 is needed, then no Automatic-EOI is
  selected.


Each interrupt request input has a separate Vector 
Register. These Vector Registers are used to store 
the pre-programmed vector number corresponding 
to their interrupt sources. In order to guarantee prop- 
er interrupt handling, all Vector Registers must be 
programmed with the predefined vector numbers. 
Since an interrupt request will be generated whenever
an ICW2 is written during the initialization sequence,
it is important that the Vector Register of
IRQ1.5 in Bank A be initialized and that the interrupt
service routine for this vector be set up before the
ICW's are written.


4.6.3 OPERATION CONTROL WORDS (OCW)


After the ICW's are programmed, the operations of 
each interrupt controller bank can be changed by 
writing into the OCW's as explained before. There is 
no special programming sequence required for the 
OCW's. Any OCW may be written at any time in or- 
der to change the mode of or to perform certain op- 
erations on the interrupt banks. 


4.6.3.1 Read Status and Poll Commands (OCW3)


Since the reading of IRR and ISR status as well as
the result of a Poll Command are available on the
same read-only Status Register, a special Read
Status/Poll Command must be issued before the
Poll/Interrupt Request/In-Service Status Register is
read. This command can be specified by writing the
required control word into OCW3. As mentioned earlier,
if both the Poll Command and the Status Read
Command are enabled simultaneously, the Poll
Command will override the Status Read. That is, after
the command execution, the Status Register will
contain the result of the Poll Command.


Note that for reading IRR and ISR, there is no need
to issue a Read Status Command to the OCW3 every
time the IRR or ISR is to be read. Once a Read
Status Command is received by the interrupt bank, it
"remembers" which register is selected. However,
this is not true when the Poll Command is used.


In the Poll Command, after the OCW3 is written, the
82370 PIC treats the next read to the Status Register
as an interrupt acknowledge. This will set the
appropriate IS bit if there is a request and read the
priority level. Interrupt Request input status remains
unchanged from the Poll Command to the Status
Read.


In addition to the above read commands, the Interrupt
Mask Register (IMR) can also be read. When
read, this register reflects the contents of the
pre-programmed OCW1 which contains information on
which interrupt request(s) is (are) currently disabled.


4.7 Register Bit Definition


INITIALIZATION COMMAND WORD 1 (ICW1)


0 - EXTERNAL CASCADE (ICW3 NEEDED)
1 - NO EXTERNAL CASCADE (ICW3 NOT NEEDED)


CONTENT IS NOT RELEVANT TO THE ACTUAL
OPERATION OF THE BANK BUT CAN BE READ
BY THE INTERRUPT SERVICE ROUTINE TO
DETERMINE WHERE THE INTERRUPT VECTORS
OF EACH BANK START.


0 - NO SLAVE CASCADED TO BANK A
1 - THERE IS A SLAVE CASCADED TO TOUT2#/IRQ3# PIN


0 - NO CASCADED REQUEST TO IRQN
1 - THERE IS A CASCADED REQUEST CONNECTED TO IRQN
    (I.E. THE CORRESPONDING INTERRUPT REQUEST INPUTS)


0 - NO CASCADED REQUEST TO IRQN
1 - THERE IS A CASCADED REQUEST CONNECTED TO IRQN


M = 1  MASK SET (INTERRUPT DISABLED)
M = 0  MASK RESET (INTERRUPT ENABLED)


R   SL  EOI
0   0   1    NON-SPECIFIC EOI COMMAND
0   1   1    SPECIFIC EOI COMMAND
1   0   1    ROTATE ON NON-SPECIFIC EOI
1   0   0    ROTATE ON AUTO-EOI MODE (SET)
0   0   0    ROTATE ON AUTO-EOI MODE (CLEAR)
1   1   1    ROTATE ON SPECIFIC EOI (L2-L0 USED)
1   1   0    SET PRIORITY (L2-L0 USED)
0   1   0    NO OPERATION
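The OCW2 command combinations listed above can be turned into a byte with a short Python sketch. The bit positions are an assumption: they follow the 82C59A-style layout (R in bit 7, SL in bit 6, EOI in bit 5, L2-L0 in bits 2-0), which should be checked against the OCW2 register diagram. The command names are hypothetical.

```python
# (R, SL, EOI) combinations taken from the OCW2 command table.
OCW2_COMMANDS = {
    "non_specific_eoi":           (0, 0, 1),
    "specific_eoi":               (0, 1, 1),
    "rotate_on_non_specific_eoi": (1, 0, 1),
    "rotate_on_auto_eoi_set":     (1, 0, 0),
    "rotate_on_auto_eoi_clear":   (0, 0, 0),
    "rotate_on_specific_eoi":     (1, 1, 1),
    "set_priority":               (1, 1, 0),
    "no_operation":               (0, 1, 0),
}


def encode_ocw2(command: str, level: int = 0) -> int:
    """Build an OCW2 byte, assuming R=bit 7, SL=bit 6, EOI=bit 5, L2-L0=bits 2-0."""
    r, sl, eoi = OCW2_COMMANDS[command]
    if not 0 <= level <= 7:
        raise ValueError("level must be in the range 0-7")
    value = (r << 7) | (sl << 6) | (eoi << 5)
    if sl:
        value |= level  # the level field is included only for the SL = 1 commands
    return value
```

Under these assumptions a Non-Specific EOI encodes as 20H and a Specific EOI for level 3 as 63H.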


ESMM  SMM
0     0     NO ACTION
0     1     NO ACTION
1     0     RESET SPECIAL MASK
1     1     SET SPECIAL MASK


RR  RIS
0   0    NO ACTION
0   1    NO ACTION
1   0    READ IR REG.
1   1    READ IS REG.


ESMM - Enable Special Mask Mode. When this bit is set to 1, it
enables the SMM bit to set or reset the Special Mask Mode. When
this bit is set to 0, the SMM bit becomes don't care.


SMM - Special Mask Mode. If ESMM = 1 and SMM = 1, the interrupt
controller bank will enter Special Mask Mode. If ESMM = 1 and
SMM = 0, the bank will revert to normal mask mode. When
ESMM = 0, SMM has no effect.


[Status register diagram, Poll Command result: BINARY CODE OF
THE HIGHEST PRIORITY LEVEL REQUESTING SERVICE]
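The OCW3 selections described in this section (Special Mask Mode, Poll, and Read IRR/ISR) can be combined into one byte as in the Python sketch below. The bit positions are an assumption taken from the 82C59A-style layout (ESMM bit 6, SMM bit 5, bits 4-3 = 01, P bit 2, RR bit 1, RIS bit 0) and the argument names are hypothetical.

```python
def encode_ocw3(special_mask=None, poll=False, read=None) -> int:
    """Build an OCW3 byte.

    special_mask: None (no action), True (set) or False (reset).
    read: None (no action), "irr" or "isr".
    """
    value = 0x08                   # bits 4-3 = 01 in the assumed layout
    if special_mask is not None:
        value |= 0x40              # ESMM = 1 lets the SMM bit take effect
        if special_mask:
            value |= 0x20          # SMM = 1 sets Special Mask Mode
    if poll:
        value |= 0x04              # P = 1 issues the Poll Command
    if read == "irr":
        value |= 0x02              # RR = 1, RIS = 0 selects IRR
    elif read == "isr":
        value |= 0x03              # RR = 1, RIS = 1 selects ISR
    elif read is not None:
        raise ValueError("read must be None, 'irr' or 'isr'")
    return value
```

If both `poll` and `read` are requested in one word, the Poll Command takes precedence on the next status read, as stated in the text.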


NOTE:
Although all Interrupt Request inputs are active LOW, the internal logic inverts the state of the pins so that when there
is a pending interrupt request at the input, the corresponding IRQ bit is set to HIGH in the Interrupt Request Status
register.


For ease of reference, Table 4-4 gives a summary of
the different operating modes and commands with
their corresponding registers.


Operational Description               Command Words   Bits
Fully Nested Mode                     OCW-Default
Non-Specific EOI Command              OCW2            EOI
Specific EOI Command                  OCW2            SL, EOI, L0-L2
Automatic EOI Mode                    ICW1, ICW4      IC4, AEOI
Rotate On Non-Specific EOI Command    OCW2            EOI
Rotate On Automatic EOI Mode          OCW2            R, SL, EOI
Set Priority Command                  OCW2            L0-L2
Rotate On Specific EOI Command        OCW2            R, SL, EOI
Interrupt Mask Register               OCW1            M0-M7
Special Mask Mode                     OCW3            ESMM, SMM
Level Triggered Mode                  ICW1            LTIM
Edge Triggered Mode                   ICW1            LTIM
Read Register Command, IRR            OCW3            RR, RIS
Read Register Command, ISR            OCW3            RR, RIS
Read IMR                              IMR             M0-M7
Poll Command                          OCW3            P
Special Fully Nested Mode             ICW1, ICW4      IC4, SFNM


5.0 PROGRAMMABLE INTERVAL TIMER


5.1 Functional Description


The 82370 contains four independently Programmable
Interval Timers: Timer 0-3. All four timers are
functionally compatible with the Intel 82C54. The first
three timers (Timer 0-2) have specific functions.
The fourth timer, Timer 3, is a general purpose timer.
Table 5-1 depicts the functions of each timer. A brief
description of each timer's function follows.


Table 5-1. Programmable Interval Timer Functions

Timer   Output         Function
0       IRQ8           Event Based IRQ8 Generator
1       TOUT1/REF#     Gen. Purpose/DRAM Refresh Req.
2       TOUT2/IRQ3#    Gen. Purpose/Speaker Out/IRQ3#
3       TOUT3#         Gen. Purpose/IRQ0 Generator


TIMER 0-Event Based Interrupt Request 8 Generator


Timer 0 is intended to be used as an Event Counter.
The output of this timer will generate an Interrupt
Request 8 (IRQ8) upon a rising edge of the timer
output (TOUT0). Normally, this timer is used to
implement a time-of-day clock or system tick. The
Timer 0 output is not available as an external signal.


TIMER 1-General Purpose/DRAM Refresh Request


The output of Timer 1, TOUT1, can be used as a
general purpose timer or as a DRAM Refresh Request
signal. The rising edge of this output creates a
DRAM refresh request to the 82370 DRAM Refresh
Controller. Upon reset, the Refresh Request function
is disabled, and the output pin is the Timer 1
output.


TIMER 2-General Purpose/Speaker Out/IRQ3#


The Timer 2 output, TOUT2#, could be used to support
tone generation to an external speaker. This pin
is a bidirectional signal. When used as an input, a
logic LOW asserted at this pin will generate an
Interrupt Request 3 (IRQ3#) (see Programmable
Interrupt Controller).


[Figure 5-1: block diagram of the Programmable Interval Timer
section. Recoverable labels: 8-bit internal bus; data buffer
and logic; Control Word Register I; Control Word Register II;
REF ENABLE (internal); TOUT2#/IRQ3#.]


TIMER 3-General Purpose/Interrupt Request 0 Generator


The output of Timer 3 is fed to an edge detector and
generates an Interrupt Request 0 (IRQ0) in the
82370. The inverted output of this timer (TOUT3#)
is also available as an external signal for general
purpose use.


The functional block diagram of the Programmable 
Interval Timer section is shown in Figure 5-1. Follow- 
ing is a description of each block. 


This part of the Programmable Interval Timer is used 
to interface the four timers to the 82370 internal bus. 
The Data Buffer is for transferring commands and 
data between the 8-bit internal bus and the timers. 


The Read/Write Logic accepts inputs from the inter- 
nal bus and generates signals to control other func- 
tional blocks within the timer section. 


The Control Word Registers are write-only registers. 
They are used to control the operating modes of the 
timers. Control Word Register I controls Timers 0, 1 
and 2, and Control Word Register II controls Timer 
3. Detailed description of the Control Word Regis- 
ters will be included in the Register Set Overview 
section. 


COUNTER 0, COUNTER 1, COUNTER 2, COUNTER 3


Counters 0, 1, 2, and 3 are the major parts of Timers
0, 1, 2, and 3, respectively. These four functional
blocks are identical in operation, so only a single
counter will be described. The internal block diagram
of one counter is shown in Figure 5-2.


The four counters share a common clock input
(CLKIN), but otherwise are fully independent. Each
counter is programmable to operate in a different
mode.


Although the Control Word Register is shown in the 
figure, it is not part of the counter itself. Its pro- 
grammed contents are used to control the opera- 
tions of the counters. 


The Status Register, when latched, contains the cur- 
rent contents of the Control Word Register and 
status of the output and Null Count Flag (see Read 
Back Command). 


The Counting Element (CE) is the actual counter. It 
is a 16-bit presettable synchronous down counter. 


The Output Latches (OL) contain two 8-bit latches
(OLM and OLL). Normally, these latches "follow"
the content of the CE. OLM contains the most
significant byte of the counter and OLL contains the least
significant byte. If the Counter Latch Command is
sent to the counter, OL will latch the present count
until read by the 80376 and then return to follow the
CE. One latch at a time is enabled by the timer's
Control Logic to drive the internal bus. This is how
the 16-bit Counter communicates over the 8-bit
internal bus. Note that CE cannot be read. Whenever
the count is read, it is one of the OL's that is being
read.


When a new count is written into the counter, the
value will be stored in the Count Registers (CR), and
transferred to CE. The transferring of the contents
from CR's to CE is defined as "loading" of the
counter. The Count Register contains two 8-bit registers:
CRM (which contains the most significant byte) and 
CRL (which contains the least significant byte). Simi- 
lar to the OL's, the Control Logic allows one register 
at a time to be loaded from the 8-bit internal bus. 
However, both bytes are transferred from the CR's 
to the CE simultaneously. Both CR's are cleared 
when the Counter is programmed. This way, if the 
Counter has been programmed for one byte count 
(either the most significant or the least significant 
byte only), the other byte will be zero. Note that CE 
cannot be written into directly. Whenever a count is 
written, it is the CR that is being written. 
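Because the 16-bit count crosses the 8-bit internal bus one byte at a time, software has to split a count before writing it and reassemble it after reading. A minimal Python sketch of that byte handling follows; the function names are hypothetical.

```python
def split_count(count: int):
    """Split a 16-bit count into (least significant byte, most significant byte)."""
    if not 0 <= count <= 0xFFFF:
        raise ValueError("count must fit in 16 bits")
    return count & 0xFF, (count >> 8) & 0xFF


def join_count(low_byte: int, high_byte: int) -> int:
    """Rebuild a 16-bit count from the bytes read from OLL and OLM."""
    return ((high_byte & 0xFF) << 8) | (low_byte & 0xFF)
```

For a one-byte count format only one of the two bytes is written, and the other Count Register byte stays at zero as described above.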


As shown in the diagram, the Control Logic consists 
of three signals: CLKIN, GATE, and OUT. CLKIN 
and GATE will be discussed in detail in the section 
that follows. OUT is the internal output of the coun- 
ter. The external outputs of some timers (TOUT) are 
the inverted version of OUT (see TOUT1, TOUT2#, 
TOUT3#). The state of OUT depends on the mode 
of operation of the timer. 


CLKIN is an input signal used by all four timers for
internal timing reference. This signal can be
independent of the 82370 system clock, CLK2. In the
following discussion, each "CLK Pulse" is defined
as the time period between a rising edge and a
falling edge, in that order, of CLKIN.


During the rising edge of CLKIN, the state of GATE 
is sampled. All new counts are loaded and counters 
are decremented on the falling edge of CLKIN. 


5.2.2 TOUT1, TOUT2#, TOUT3#


TOUT1, TOUT2# and TOUT3# are the external
output signals of Timer 1, Timer 2 and Timer 3,
respectively. TOUT2# and TOUT3# are the inverted
signals of their respective counter outputs, OUT.
There is no external output for Timer 0.


If Timer 2 is to be used as a tone generator of a 
speaker, external buffering must be used to provide 
sufficient drive capability. 


The Outputs of Timer 2 and 3 are dual function pins.
The output pin of Timer 2 (TOUT2#/IRQ3#), which
is a bidirectional open-collector signal, can also be
used as interrupt request input. When the interrupt
function is enabled (through the Programmable
Interrupt Controller), a LOW on this input will generate
an Interrupt Request 3# to the 82370 Programmable
Interrupt Controller. This pin has a weak internal
pull-up resistor. To use the IRQ3# function, Timer 2
should be programmed so that OUT2 is LOW.
Additionally, OUT3 of Timer 3 is connected to an edge
detector which will generate an Interrupt Request 0
(IRQ0) to the 82370 after the rising edge of OUT3
(see Figure 5-1).


GATE is not an externally controllable signal. Rath- 
er, it can be software controlled with the Internal 
Control Port. The state of GATE is always sampled 
on the rising edge of CLKIN. Depending on the 
mode of operation, GATE is used to enable/disable 
counting or trigger the start of an operation. 


For Timer 0 and 1, GATE is always enabled (HIGH). 
For Timer 2 and 3, GATE is connected to Bit 0 and 
6, respectively, of an Internal Control Port (at ad- 
dress 61H) of the 82370. After a hardware reset, the 
state of GATE of Timer 2 and 3 is disabled (LOW). 
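The GATE control bits for Timer 2 and Timer 3 can be manipulated with a read-modify-write of the control port value, sketched below in Python. Two assumptions are made that the text does not state outright: a 1 in the bit drives GATE HIGH, and the other bits of the port should be preserved. The helper name is hypothetical and no port I/O is performed.

```python
GATE_BIT = {2: 0, 3: 6}  # timer number -> bit position in the control port at 61H


def set_gate(port_value: int, timer: int, enable: bool) -> int:
    """Return the new control port value with the timer's GATE bit set or cleared."""
    if timer not in GATE_BIT:
        raise ValueError("only Timer 2 and Timer 3 have a software-controlled GATE")
    mask = 1 << GATE_BIT[timer]
    if enable:
        return (port_value | mask) & 0xFF   # assumed: 1 = GATE HIGH (enabled)
    return port_value & ~mask & 0xFF        # assumed: 0 = GATE LOW (disabled)
```

Starting from 00H, enabling Timer 2 and then Timer 3 would give 01H and then 41H under these assumptions.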


Each timer can be independently programmed to 
operate in one of six different modes. Timers are 
programmed by writing a Control Word into the Con- 
trol Word Register followed by an Initial Count (see 
Programming). 


The following are defined for use in describing the 
different modes of operation. 


CLK Pulse - A rising edge, then a falling edge, in
that order, of CLKIN.

Trigger - A rising edge of a timer's GATE input.

Timer/Counter Loading - The transfer of a count
from Count Register (CR) to Count Element (CE).


5.3.1 MODE 0-INTERRUPT ON TERMINAL COUNT


Mode 0 is typically used for event counting. After the
Control Word is written, OUT is initially LOW and will
remain LOW until the counter reaches zero. OUT
then goes HIGH and remains HIGH until a new
count or a new Mode 0 Control Word is written into
the counter.


In this mode, GATE = HIGH enables counting;
GATE = LOW disables counting. However, GATE
has no effect on OUT.


After the Control Word and initial count are written to 
a timer, the initial count will be loaded on the next 
CLK pulse. This CLK pulse does not decrement the 
count, so for an initial count of N, OUT does not go
HIGH until N + 1 CLK pulses after the initial count is
written.
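The N + 1 relationship can be checked with a simplified Python model of Mode 0. This is a sketch only: GATE is assumed HIGH throughout, an initial count of 0 is not handled, and the function name is hypothetical.

```python
def mode0_pulses_until_out_high(initial_count: int) -> int:
    """Count CLK pulses from writing the count until OUT goes HIGH in Mode 0."""
    if initial_count < 1:
        raise ValueError("this sketch handles initial counts of 1 or more")
    pulses = 1                 # the first CLK pulse loads the count without decrementing
    counter = initial_count
    while counter > 0:
        counter -= 1           # each later CLK pulse decrements the count by one
        pulses += 1
    return pulses              # OUT goes HIGH when the count reaches zero
```

For an initial count of 4, the model reports 5 CLK pulses: one to load and four to count down.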


If a new count is written to the timer, it will be loaded
on the next CLK pulse and counting will continue
from the new count. If a two-byte count is written,
the following happens:


1. Writing the first byte disables counting. OUT is set
LOW immediately (i.e. no CLK pulse required).


2. Writing the second byte allows the new count to
be loaded on the next CLK pulse.


This allows the counting sequence to be synchronized
by software. Again, OUT does not go HIGH until
N + 1 CLK pulses after the new count of N is written.


NOTES:
The following conventions apply to all mode timing diagrams.
1. Counters are programmed for binary (not BCD) counting and for reading/writing least significant byte (LSB) only.
2. The counter is always selected (CS# always low).
3. CW stands for "Control Word"; CW = 10 means a control word of 10 Hex is written to the counter.
4. LSB stands for "Least significant byte" of count.
5. Numbers below diagrams are count values.
The lower number is the least significant byte.
The upper number is the most significant byte. Since the counter is programmed to read/write LSB only, the most
significant byte cannot be read.
N stands for an undefined count.
Vertical lines show transitions between count values.


If an initial count is written while GATE is LOW, the
counter will be loaded on the next CLK pulse. When
GATE goes HIGH, OUT will go HIGH N CLK pulses
later; no CLK pulse is needed to load the counter as
this has already been done.


5.3.2 MODE 1-GATE RETRIGGERABLE ONE-SHOT


In this mode, OUT will be initially HIGH. OUT will go
LOW on the CLK pulse following a trigger to start the
one-shot operation. The OUT signal will then remain
LOW until the timer reaches zero. At this point, OUT
will stay HIGH until the next trigger comes in. Note
that the GATE signals of Timer 0 and 1 are
internally set to HIGH.


After writing the Control Word and initial count, the
timer is considered "armed". A trigger results in
loading the timer and setting OUT LOW on the next
CLK pulse. Therefore, an initial count of N will result
in a one-shot pulse width of N CLK cycles. Note
that this one-shot operation is retriggerable; i.e. OUT
will remain LOW for N CLK pulses after every trigger.
The one-shot operation can be repeated without
rewriting the same count into the timer.


If a new count is written to the timer during a one- 
shot operation, the current one-shot pulse width will 
not be affected until the timer is retriggered. This is 
because loading of the new count to CE will occur 
only when the one-shot is triggered. 


Mode 2 is a divide-by-N counter. It is typically
used to generate a Real Time Clock interrupt. OUT
will initially be HIGH. When the initial count has
decremented to 1, OUT goes LOW for one CLK pulse,
then OUT goes HIGH again. Then the timer reloads
the initial count and the process is repeated. In other
words, this mode is periodic since the same sequence
repeats itself indefinitely. For an initial
count of N, the sequence repeats every N CLK
cycles.
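The divide-by-N behavior can be illustrated with a simplified Python model that lists the level of OUT on each CLK pulse once the count has been loaded. It assumes GATE stays HIGH and ignores the load pulse itself; the function name is hypothetical.

```python
def mode2_out_sequence(n: int, pulses: int) -> list:
    """Return OUT (1 = HIGH, 0 = LOW) for each CLK pulse after the count is loaded."""
    if n < 2:
        raise ValueError("this sketch expects an initial count of at least 2")
    out = []
    counter = n
    for _ in range(pulses):
        out.append(0 if counter == 1 else 1)          # OUT is LOW while the count is 1
        counter = n if counter == 1 else counter - 1  # reload after the LOW pulse
    return out
```

With N = 4 the model yields three HIGH pulses followed by one LOW pulse, repeating every four CLK pulses, so the output rate is the CLKIN rate divided by N.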


Similar to Mode 0, GATE = HIGH enables counting,
whereas GATE = LOW disables counting. If GATE
goes LOW during an output pulse (LOW), OUT is set
HIGH immediately. A trigger (rising edge on GATE)
will reload the timer with the initial count on the next
CLK pulse. Then, OUT will go LOW (for one CLK
pulse) N CLK pulses after the new trigger. Thus,
GATE can be used to synchronize the timer.


NOTE:
A GATE transition should not occur one clock prior to terminal count.


After writing a Control Word and initial count, the
timer will be loaded on the next CLK pulse. OUT
goes LOW (for one CLK pulse) N CLK pulses after
the initial count is written. This is another way the
timer may be synchronized by software.


Writing a new count while counting does not affect
the current counting sequence because the new
count will not be loaded until the end of the current
counting cycle. If a trigger is received after writing a
new count but before the end of the current period,
the timer will be loaded with the new count on the
next CLK pulse after the trigger, and counting will
continue with the new count.


Mode 3 is typically used for Baud Rate generation.
Functionally, this mode is similar to Mode 2 except
for the duty cycle of OUT. In this mode, OUT will be
initially HIGH. When half of the initial count has
expired, OUT goes LOW for the remainder of the count.
The counting sequence will be repeated, thus this
mode is also periodic. Note that an initial count of N
results in a square wave with a period of N CLK
pulses.


The GATE input can be used to synchronize the timer.
GATE = HIGH enables counting; GATE = LOW
disables counting. If GATE goes LOW while OUT is
LOW, OUT is set HIGH immediately (i.e. no CLK
pulse is required). A trigger reloads the timer with the
initial count on the next CLK pulse.


After writing a Control Word and initial count, the
timer will be loaded on the next CLK pulse. This
allows the timer to be synchronized by software.


Writing a new count while counting does not affect
the current counting sequence. If a trigger is
received after writing a new count but before the end
of the current half-cycle of the square wave, the timer
will be loaded with the new count on the next CLK
pulse and counting will continue from the new count.
Otherwise, the new count will be loaded at the end
of the current half-cycle.


There is a slight difference in operation depending 
on whether the initial count is EVEN or ODD. The 
following description is to show exactly how this 
mode is implemented. 


For EVEN counts: OUT is initially HIGH. The initial count
is loaded on one CLK pulse and is decremented by two on
succeeding CLK pulses. When the count expires
(decremented to 2), OUT changes to LOW and the timer is
reloaded with the initial count. The above process is
repeated indefinitely.


For ODD counts: OUT is initially HIGH. The initial count
minus one (which is an even number) is loaded on one CLK
pulse and is decremented by two on succeeding
CLK pulses. One CLK pulse after the count expires
(decremented to 2), OUT goes LOW and the timer is
loaded with the initial count minus one again.
Succeeding CLK pulses decrement the count by two.
When the count expires, OUT goes HIGH immediately
and the timer is reloaded with the initial count
minus one. The above process is repeated
indefinitely. So for ODD counts, OUT will be HIGH for
(N + 1)/2 counts and LOW for (N - 1)/2 counts.
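The resulting duty cycle for even and odd initial counts can be summarized in a few lines of Python. This is a sketch of the arithmetic only, not of the counter hardware, and the function name is hypothetical.

```python
def mode3_high_low(n: int):
    """Return (high, low) durations of OUT in CLK pulses for initial count n."""
    if n < 2:
        raise ValueError("this sketch expects an initial count of at least 2")
    if n % 2 == 0:
        return n // 2, n // 2              # even count: symmetric square wave
    return (n + 1) // 2, (n - 1) // 2      # odd count: HIGH one pulse longer than LOW
```

In both cases the two durations add up to N, matching the stated period of N CLK pulses.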




NOTE:
A GATE transition should not occur one clock prior to terminal count.


5.3.5 MODE 4-INITIAL COUNT TRIGGERED STROBE


This mode allows a strobe pulse to be generated by
writing an initial count to the timer. Initially, OUT will
be HIGH. When a new initial count is written into the
timer, the counting sequence will begin. When the
initial count expires (decremented to 1), OUT will go
LOW for one CLK pulse and then go HIGH again.


After writing the Control Word and initial count, the
timer will be loaded on the next CLK pulse. This CLK
pulse does not decrement the count, so for an initial
count of N, OUT does not strobe LOW until N + 1
CLK pulses after the initial count is written.


Again, GATE = HIGH enables counting while
GATE = LOW disables counting. GATE has no
effect on OUT.


If a new count is written during counting, it will be
loaded on the next CLK pulse and counting will
continue from the new count. If a two-byte count is
written, the following happens:


1. Writing the first byte has no effect on counting.


2. Writing the second byte allows the new count to
be loaded on the next CLK pulse.


OUT will strobe LOW N + 1 CLK pulses after the
new count of N is written. Therefore, the time at which
the strobe pulse occurs depends on the
value of the initial count loaded.


5.3.6 MODE 5-GATE RETRIGGERABLE STROBE


Mode 5 is very similar to Mode 4 except the count
sequence is triggered by the gate signal instead of
by writing an initial count. Initially, OUT will be HIGH.
Counting is triggered by a rising edge of GATE.
When the initial count has expired (decremented to
1), OUT will go LOW for one CLK pulse and then go
HIGH again.


After loading the Control Word and initial count, the
Count Element will not be loaded until the CLK pulse
after a trigger. This CLK pulse does not decrement
the count. Therefore, for an initial count of N, OUT
does not strobe LOW until N + 1 CLK pulses after a
trigger.


The counting sequence is retriggerable. Every trigger
will result in the timer being loaded with the initial
count on the next CLK pulse.


If the new count is written during counting, the
current counting sequence will not be affected. If a
trigger occurs after the new count is written but before
the current count expires, the timer will be loaded
with the new count on the next CLK pulse and a new
count sequence will start from there.


5.3.7.1 GATE 


The GATE input is always sampled on the rising
edge of CLKIN. In Modes 0, 2, 3 and 4, the GATE
input is level sensitive. The logic level is sampled on
the rising edge of CLKIN. In Modes 1, 2, 3 and 5, the
GATE input is rising edge sensitive. In these modes,
a rising edge of GATE (trigger) sets an edge sensitive
flip-flop in the timer. The flip-flop is reset immediately
after it is sampled. This way, a trigger will be
detected no matter when it occurs; i.e. a HIGH logic
level does not have to be maintained until the next
rising edge of CLKIN. Note that in Modes 2 and 3,
the GATE input is both edge and level sensitive.


Mode  GATE LOW or Going LOW    GATE Rising              GATE HIGH
0     Disable count            No Effect                Enable count
1     No Effect                1. Initiate count        No Effect
                               2. Reset output
                                  after next clock
2     1. Disable count         Initiate count           Enable count
      2. Sets output HIGH
         immediately
3     1. Disable count         Initiate count           Enable count
      2. Sets output HIGH
         immediately
4     Disable count            No Effect                Enable count
5     No Effect                Initiate count           No Effect


New counts are loaded and counters are decremented
on the falling edge of CLKIN. The largest
possible initial count is 0. This is equivalent to 2^16
for binary counting and 10^4 for BCD counting.
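The special meaning of a programmed count of 0 can be captured in a short Python sketch. Here `programmed` is the numeric value of the count (for BCD, the decimal number 0-9999 rather than its BCD encoding); the function name is hypothetical.

```python
def effective_count(programmed: int, bcd: bool = False) -> int:
    """Return the number of counts represented by a programmed initial count."""
    limit = 10**4 if bcd else 2**16
    if not 0 <= programmed < limit:
        raise ValueError("programmed count out of range for the selected format")
    return limit if programmed == 0 else programmed  # 0 stands for the largest count
```

So a binary count of 0 stands for 65536 counts, and a BCD count of 0 stands for 10000.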


Note that the counter does not stop when it reaches 
zero. In Modes 0, 1, 4 and 5, the counter 'wraps 
around' to the highest count: either FFFF Hex for 
binary counting or 9999 for BCD counting, and con- 
tinues counting. Modes 2 and 3 are periodic. The 
counter reloads itself with the initial count and con- 
tinues counting from there. 


The minimum and maximum initial count in each 
counter depends on the mode of operation. They 
are summarized below. 


Mode  Min  Max
0     1    0
1     1    0
2     2    0
3     2    0
4     1    0
5     1    0


5.4 Register Set Overview


The Programmable Interval Timer module of the 
82370 contains a set of six registers. The port ad- 
dress map of these registers is shown in Table 5-2. 


Port Address  Description
40H           Counter 0 Register (read/write)
41H           Counter 1 Register (read/write)
42H           Counter 2 Register (read/write)
43H           Control Word Register I (Counter 0, 1 & 2) (write-only)
44H           Counter 3 Register (read/write)
45H           Reserved
46H           Reserved
47H           Control Word Register II (Counter 3) (write-only)
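For software that drives the timers, the port map above can be expressed as constants with a small lookup helper. The Python names below are hypothetical; the addresses are those listed in the table.

```python
COUNTER_PORT = {0: 0x40, 1: 0x41, 2: 0x42, 3: 0x44}  # timer number -> counter register
CONTROL_WORD_REGISTER_I = 0x43    # controls Counters 0, 1 and 2
CONTROL_WORD_REGISTER_II = 0x47   # controls Counter 3


def control_port_for(timer: int) -> int:
    """Return the Control Word Register address that governs the given timer."""
    if timer not in COUNTER_PORT:
        raise ValueError("timer must be 0, 1, 2 or 3")
    return CONTROL_WORD_REGISTER_II if timer == 3 else CONTROL_WORD_REGISTER_I
```

Note that Counter 3 is not contiguous with the first three: it sits at 44H, after Control Word Register I.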


The four Counter Registers are functionally identical 8-bit registers.
They are used to write the initial count value into the 
respective timer. Also, they can be used to read the 
latched count value of a timer. Since they are 8-bit 
registers, reading and writing of the 16-bit initial 
count must follow the count format specified in the 
Control Word Registers; i.e. least significant byte 
only, most significant byte only, or least significant 
byte then most significant byte (see Programming). 


There are two Control Word Registers associated
with the Timer section. One of the two registers
(Control Word Register I) is used to control the
operations of Counters 0, 1 and 2 and the other (Control
Word Register II) is for Counter 3. The major
functions of both Control Word Registers are listed
below:
- Select the timer to be programmed.
- Define which mode the selected timer is to
  operate in.
- Define the count sequence; i.e. if the selected
  timer is to count as a Binary Counter or a Binary
  Coded Decimal (BCD) Counter.
- Select the byte access sequence during timer
  read/write operations; i.e. least significant byte
  only, most significant byte only, or least
  significant byte first, then most significant byte.


Also, the Control Word Registers can be programmed to perform a
Counter Latch Command or a Read Back Command which will be described
later.


Upon power-up or reset, the state of all timers is undefined. The
mode, count value, and output of all timers are random. From this
point on, how each timer operates is determined solely by how it is
programmed. Each timer must be programmed before it can be used.
Since the outputs of some timers can generate interrupt signals to
the 82370, all timers should be initialized to a known state.


Counters are programmed by writing a Control Word into their
respective Control Word Registers. Then, an Initial Count can be
written into the corresponding Count Register. In general, the
programming procedure is very flexible. Only two conventions need to
be remembered:


1. For each timer, the Control Word must be written before the
initial count is written.


2. The 16-bit initial count must follow the count format specified
in the Control Word (least significant byte only, most significant
byte only, or least significant byte first, followed by most
significant byte).


Since the two Control Word Registers and the four Counter Registers
have separate addresses, and each timer can be individually selected
by the appropriate Control Word Register, no special instruction
sequence is required. Any programming sequence that follows the
conventions above is acceptable.
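

The two conventions above can be sketched in C. This is only an
illustration under stated assumptions: the bit positions (counter
select in bits 7-6, read/write format in bits 5-4, mode in bits 3-1,
BCD flag in bit 0) are assumed to follow the 8254-style layout
suggested by the Register Bit Definitions, and pit_control_word and
pit_count_bytes are hypothetical helper names. The port writes
themselves are left to the platform's I/O routine.

```c
#include <stdint.h>

/* Read/write (byte access) field values from the Register Bit Definitions. */
enum pit_rw {
    PIT_RW_LSB     = 1,  /* least significant byte only */
    PIT_RW_MSB     = 2,  /* most significant byte only */
    PIT_RW_LSB_MSB = 3   /* least significant byte, then most significant */
};

/* Build a Control Word.
 * ASSUMPTION: 8254-style layout, bits 7-6 = counter select,
 * bits 5-4 = read/write format, bits 3-1 = mode, bit 0 = BCD. */
static uint8_t pit_control_word(unsigned select, enum pit_rw rw,
                                unsigned mode, int bcd)
{
    return (uint8_t)(((select & 0x3u) << 6) |
                     (((unsigned)rw & 0x3u) << 4) |
                     ((mode & 0x7u) << 1) |
                     (bcd ? 1u : 0u));
}

/* Split a 16-bit initial count into the bytes to write, in the order
 * required by the programmed format. Returns the number of bytes
 * placed in out[]. */
static int pit_count_bytes(uint16_t count, enum pit_rw rw, uint8_t out[2])
{
    switch (rw) {
    case PIT_RW_LSB:
        out[0] = (uint8_t)(count & 0xFFu);
        return 1;
    case PIT_RW_MSB:
        out[0] = (uint8_t)(count >> 8);
        return 1;
    case PIT_RW_LSB_MSB:
        out[0] = (uint8_t)(count & 0xFFu);
        out[1] = (uint8_t)(count >> 8);
        return 2;
    }
    return 0;
}
```

Under these assumptions, Counter 1 in Mode 2 with two-byte binary
counts yields a control word of 74H, which would be written to port
43H before the count bytes are written to port 41H.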


A new initial count may be written to a timer at any time without
affecting the timer's programmed mode in any way. Count sequence
will be affected as described in the Modes of Operation section.
Note that the new count must follow the programmed count format.


If a timer is previously programmed to read/write two-byte counts,
the following precaution applies. A program must not transfer
control between writing the first and second byte to another routine
which also writes into the same timer. Otherwise, the read/write
will result in an incorrect count.


Whenever a Control Word is written to a timer, all control logic for
that timer(s) is immediately reset (i.e. no CLK pulse is required).
Also, the corresponding output pin, TOUT#, goes to a known initial
state.


Three methods are available to read the current count as well as the
status of each timer. They are: Read Counter Registers, Counter
Latch Command and Read Back Command. Below is a description of these
methods.


The current count of a timer can be read by performing a read
operation on the corresponding Counter Register. The only
restriction of this read operation is that the CLKIN of the timers
must be inhibited by using external logic. Otherwise, the count may
be in the process of changing when it is read, giving an undefined
result. Note that since all four timers are sharing the same CLKIN
signal, inhibiting CLKIN to read a timer will unavoidably disable
the other timers also. This may prove to be impractical. Therefore,
it is suggested that either the Counter Latch Command or the Read
Back Command be used to read the current count of a timer.


Another alternative is to temporarily disable a timer before reading
its Counter Register by using the GATE input. Depending on the mode
of operation, GATE = LOW will disable the counting operation.
However, this option is available on Timer 2 and 3 only, since the
GATE signals of the other two timers are internally enabled all the
time.


A Counter Latch Command will be executed whenever a special Control
Word is written into a Control Word Register. Two bits written into
the Control Word Register distinguish this command from a 'regular'
Control Word (see Register Bit Definition). Also, two other bits in
the Control Word will select which counter is to be latched.


Upon execution of this command, the selected counter's Output Latch
(OL) latches the count at the time the Counter Latch Command is
received. This count is held in the latch until it is read by the
80376, or until the timer is reprogrammed. The count is then
unlatched automatically and the OL returns to "following" the
Counting Element (CE). This allows reading the contents of the
counters "on the fly" without affecting counting in progress.
Multiple Counter Latch Commands may be used to latch more than one
counter. Each latched count is held until it is read. Counter Latch
Commands do not affect the programmed mode of the timer in any way.
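

A minimal C sketch of the Counter Latch Command follows. It assumes
the 8254-style layout in which the counter select field occupies
bits 7-6 and the read/write field (bits 5-4) is 00 for a latch
command; pit_latch_command and pit_join_count are hypothetical
helper names, and the port accesses are not shown.

```c
#include <stdint.h>

/* Counter Latch Command for counter index 0-2 (index 0 also stands
 * for Counter 3 when written to Control Word Register II).
 * ASSUMPTION: select field in bits 7-6, read/write field = 00.
 * Returns -1 for select = 3, which encodes the Read Back Command. */
static int pit_latch_command(unsigned select)
{
    if (select > 2u)
        return -1;
    return (int)(select << 6);
}

/* Combine the two bytes of a latched count that was read in the
 * two-byte format (least significant byte first). */
static uint16_t pit_join_count(uint8_t lsb, uint8_t msb)
{
    return (uint16_t)(((uint16_t)msb << 8) | lsb);
}
```

The command byte would be written to the Control Word Register, after
which the latched count is read from the Counter Register in the
programmed byte order.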


If a counter is latched, and at some time later, it is latched again
before the prior latched count is read, the second Counter Latch
Command is ignored. The count read will then be the count at the
time the first command was issued.


In any event, the latched count must be read according to the
programmed format. Specifically, if the timer is programmed for
two-byte counts, two bytes must be read. However, the two bytes do
not have to be read one right after the other. Read/write or
programming operations of other timers may be performed between
them.


Another feature of this Counter Latch Command is that read and write
operations of the same timer may be interleaved. For example, if the
timer is programmed for two-byte counts, the following sequence is
valid.


1. Read least significant byte.


2. Write new least significant byte.


3. Read most significant byte.


4. Write new most significant byte.


If a timer is programmed to read/write two-byte counts, the
following precaution applies. A program must not transfer control
between reading the first and second byte to another routine which
also reads from that same timer. Otherwise, an incorrect count will
be read.


The Read Back Command is another special Command Word operation
which allows the user to read the current count value and/or the
status of the selected timer(s). Like the Counter Latch Command, two
bits in the Command Word identify this as a Read Back Command (see
Register Bit Definition).


The Read Back Command may be used to latch multiple counter Output
Latches (OL's) by selecting more than one timer within a Command
Word. This single command is functionally equivalent to several
Counter Latch Commands, one for each counter to be latched. Each
counter's latched count will be held until it is read by the 80376
or until the timer is reprogrammed. The counter is automatically
unlatched when read, but other counters remain latched until they
are read. If multiple Read Back commands are issued to the same
timer without reading the count, all but the first are ignored; i.e.
the count read will correspond to the very first Read Back Command
issued.


As mentioned previously, the Read Back Command may also be used to
latch status information of the selected timer(s). When this
function is enabled, the status of a timer can be read from the
Counter Register after the Read Back Command is issued. The status
information of a timer includes the following:


1. Mode of timer:


This allows the user to check the mode of operation of the timer
last programmed.


2. State of TOUT pin of the timer:


This allows the user to monitor the counter's output pin via
software, possibly eliminating some hardware from a system.


3. Null Count/Count available:


The Null Count Bit in the status byte indicates if the last count
written to the Count Register (CR) has been loaded into the Counting
Element (CE). The exact time this happens depends on the mode of the
timer and is described in the Programming section. Until the count
is loaded into the Counting Element (CE), it cannot be read from the
timer. If the count is latched or read before this occurs, the count
value will not reflect the new count just written.


If multiple status latch operations of the timer(s) are performed
without reading the status, all but the first command are ignored;
i.e. the status read in will correspond to the first Read Back
Command issued.


Both the current count and status of the selected timer(s) may be
latched simultaneously by enabling both functions in a single Read
Back Command. This is functionally the same as issuing two separate
Read Back Commands at once. Once again, if multiple read commands
are issued to latch both the count and status of a timer, all but
the first command will be ignored.


If both count and status of a timer are latched, the first read
operation of that timer will return the latched status, regardless
of which was latched first. The next one or two (if two count bytes
are to be read) read operations return the latched count. Note that
subsequent read operations on the Counter Register will return the
unlatched count (like the first read method discussed).
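

The Read Back Command and the resulting read order can be sketched
as follows. The bit positions are an assumption based on the
8254-style layout (bits 7-6 = 11, bit 5 = 0 to latch count, bit 4 = 0
to latch status, bits 3-1 selecting Counters 2, 1 and 0, bit 0 = 0);
pit_read_back and pit_latched_reads are hypothetical names used only
for illustration.

```c
#include <stdint.h>

/* Build a Read Back Command.
 * ASSUMPTION: 8254-style layout as described in the lead-in; for
 * Control Word Register II the Counter 0 select bit is taken to
 * stand for Counter 3.
 * counter_mask bit n selects counter n (n = 0..2). */
static uint8_t pit_read_back(int latch_count, int latch_status,
                             unsigned counter_mask)
{
    unsigned cmd = 0xC0u;                 /* bits 7-6 = 11 */
    if (!latch_count)
        cmd |= 0x20u;                     /* 1 = do not latch count */
    if (!latch_status)
        cmd |= 0x10u;                     /* 1 = do not latch status */
    cmd |= (counter_mask & 0x7u) << 1;    /* 1 = counter is selected */
    return (uint8_t)cmd;
}

/* Number of Counter Register reads needed to consume what was
 * latched for one timer: the status byte comes first, followed by
 * one or two count bytes depending on the programmed format. */
static int pit_latched_reads(int latch_count, int latch_status,
                             int two_byte_format)
{
    int reads = latch_status ? 1 : 0;
    if (latch_count)
        reads += two_byte_format ? 2 : 1;
    return reads;
}
```

With both functions enabled for a timer programmed for two-byte
counts, three reads are needed: status, then the least significant
count byte, then the most significant count byte.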


5.6 
Register Bit Definitions 


COUNTER 0, 1, 2, 3 REGISTER (READ/WRITE)


Port Address    Description

40H             Counter 0 Register (read/write)
41H             Counter 1 Register (read/write)
42H             Counter 2 Register (read/write)
44H             Counter 3 Register (read/write)
45H             Reserved
46H             Reserved


Note that these 8-bit registers are for writing and reading of one
byte of the 16-bit count value, either the most significant or the
least significant byte.


CONTROL WORD REGISTER I & II (WRITE-ONLY)


Port Address    Description

43H             Control Word Register I (Counter 0, 1, 2) (write-only)
47H             Control Word Register II (Counter 3) (write-only)


[Register format figures. The recoverable field definitions are
listed below.]


Counter Register: LSB of count byte or MSB of count byte.


Control Word Register I (Counters 0, 1, 2):

SELECT COUNTER:
00      Select Counter 0
01      Select Counter 1
10      Select Counter 2
11      Read Back Command for Counters 0-2

COUNT SEQUENCE:
0       16-bit binary counter
1       BCD counter (4 decades)

READ/WRITE:
00      Counter Latch Command
01      Read/write LSB byte only
10      Read/write MSB byte only
11      Read/write LSB, then MSB byte

MODE:
000     Mode 0
001     Mode 1
X10     Mode 2
X11     Mode 3
100     Mode 4
101     Mode 5


Control Word Register II (Counter 3):

SELECT COUNTER:
00      Select Counter 3
01      Reserved
10      Reserved
11      Read Back Command for Counter 3

The COUNT SEQUENCE, READ/WRITE and MODE fields are encoded as for
Control Word Register I.


Counter Latch Command, counter select:
00      Counter 0 (or 3)
01      Counter 1
10      Counter 2
11      Read Back Command


Read Back Command:
0 - Latch count             1 - Do not latch count
0 - Latch status            1 - Do not latch status
0 - Counter not selected    1 - Counter is selected


Status byte, Null Count bit:
0 - Count available for reading
1 - Null count

The 82370 contains a programmable Wait State 
Generator which can generate a pre-programmed 
number of wait states during both CPU and DMA 
initiated bus cycles. This Wait State Generator is ca- 
pable of generating 1 to 16 wait states in non-pipe- 
lined mode, and 0 to 15 wait states in pipelined 
mode. Depending on the bus cycle type and the two 
Wait State Control inputs (WSC 0-1), 
a pre-pro- 
grammed number of wait states in the selected Wait 
State Register will be generated. 


The Wait State Generator can also be disabled to allow the use of
devices capable of generating their own READY# signals. Figure 6-1
is a block diagram of the Wait State Generator.


The following describes the interface signals which affect the
operation of the Wait State Generator. The READY#, WSC0 and WSC1
signals are inputs. READYO# is the ready output signal to the host
processor.


READY# is an active LOW input signal which indicates to the 82370
the completion of a bus cycle. In the Master mode (e.g. 82370
initiated DMA transfer), this signal is monitored to determine
whether a peripheral or memory needs wait states inserted in the
current bus cycle. In the Slave mode, it is used (together with the
ADS# signal) to trace CPU bus cycles to determine if the current
cycle is pipelined.


6.2.2 READYO#


READYO# (Ready Out#) is an active LOW output signal and is the
output of the Wait State Generator. The number of wait states
generated depends on the WSC(0-1) inputs. Note that special cases
are handled for access to the 82370 internal registers and for the
Refresh cycles. For 82370 internal register access, READYO# will be
delayed to take into account the command recovery time of the
register. One or more wait states will be generated in a pipelined
cycle. During refresh, the number of wait states will be determined
by the preprogrammed value in the Refresh Wait State Register.


In the simplest configuration, READYO# can be connected to the
READY# input of the 82370 and the 80376 CPU. This is, however, not
always the case. If external circuitry is to control the READY#
inputs as well, additional logic will be required (see Application
Issues).


These two Wait State Control inputs, together with the M/IO# input,
select one of the three pre-programmed 8-bit Wait State Registers
which determines the number of wait states to be generated. The most
significant half of the three Wait State Registers corresponds to
memory accesses, the least significant half to I/O accesses. The
combination WSC(0-1) = 11 disables the Wait State Generator.


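[Figure 6-1. Wait State Generator block diagram. WSC0, WSC1 and
M/IO# feed the register select logic; ADS# and READY# are also
inputs. Wait State Registers 0, 1 and 2 each hold a MEMORY count in
D7-D4 and an I/O count in D3-D0. The fourth register holds the
REFRESH count, with its remaining half reserved.]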


The timing diagram of two typical non-pipelined cy- 
cles with 82370 generated wait states is shown in 
Figure 6-2. In this diagram, it is assumed that the 
internal registers of the 82370 are not addressed. 
During the first T2 state of each bus cycle, the Wait 
State Control and the M/IO# 
inputs are sampled to 
determine which Wait State Register (if any) is se- 
lected. If the WSC inputs are active (i.e. not both are 
driven HIGH), the pre-programmed number of wait 
states corresponding to the selected Wait State 
Register will be requested. This is done by driving 
the READYO# output HIGH during the end of each 
T2 state. 


The WSC (0-1) inputs need only be valid during the 
very first T2 state of each non-pipelined cycle. As a 
general rule, the WSC inputs are sampled on the 
rising edge of the next clock (82384 CLK) after the 
last state when ADS# (Address Status) is asserted. 


The number of wait states generated depends on 
the type of bus cycle, and the number of wait states 
requested. The various combinations are discussed 
below. 


1. Access the 82370 internal registers: 2 to 5 wait 
states, depending upon the specific register ad- 
dressed. Some back-to-back sequences to the Inter- 
rupt Controller will require 7 wait states. 


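[Figure 6-2. Timing diagram of non-pipelined cycles (T1, T2, T2)
with one 82370 generated wait state. Signals shown: CLK2, CLK,
A(1-23), M/IO#, BLE#, BHE#, WSC(0-1), ADS#, READY#, READYO#.]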


2. Interrupt Acknowledge to the 82370: 5 wait states. 


3. Refresh: As programmed in the Refresh Wait State Register (see
Register Set Overview). Note that if WSC (0-1) = 11, READYO# will
stay inactive.


4. Other bus cycles: The WSC (0-1) and M/IO# inputs select a Wait
State Register. The number of wait states generated equals the
pre-programmed wait state count in that register plus 1. The Wait
State Register selection is defined as follows (Table 6-1).


Table 6-1. Wait State Register Selection


M/IO#    WSC(0-1)    Register Selected

0        00          WAIT REG 0 (I/O half)
0        01          WAIT REG 1 (I/O half)
0        10          WAIT REG 2 (I/O half)
1        00          WAIT REG 0 (MEM half)
1        01          WAIT REG 1 (MEM half)
1        10          WAIT REG 2 (MEM half)
X        11          Wait State Gen. Disabled
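

The selection in Table 6-1 can be modeled with a short C function.
This is a sketch for checking wait state settings on paper, not a
description of the internal logic: wait_states is a hypothetical
name, and the wsc argument is taken to be the two-bit WSC(0-1) code
exactly as it is written in the table.

```c
#include <stdint.h>

/* Number of wait states the Wait State Generator would request for
 * an ordinary bus cycle (not an internal register access, interrupt
 * acknowledge or refresh).
 *   wait_reg   contents of Wait State Registers 0, 1 and 2
 *   wsc        WSC(0-1) code as written in Table 6-1 (0..3)
 *   mem_cycle  nonzero for M/IO# = 1 (memory), zero for I/O
 *   pipelined  nonzero for a pipelined cycle
 * Returns -1 when WSC(0-1) = 11 (generator disabled). */
static int wait_states(const uint8_t wait_reg[3], unsigned wsc,
                       int mem_cycle, int pipelined)
{
    uint8_t reg;
    unsigned count;

    if ((wsc & 0x3u) == 0x3u)
        return -1;

    reg = wait_reg[wsc & 0x3u];
    /* Most significant half = memory, least significant half = I/O. */
    count = mem_cycle ? (unsigned)(reg >> 4) : (unsigned)(reg & 0x0Fu);

    /* Non-pipelined cycles get the programmed count plus 1. */
    return (int)count + (pipelined ? 0 : 1);
}
```

For instance, a register holding 21H gives 3 wait states for a
non-pipelined memory cycle and 1 wait state for a pipelined I/O
cycle.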


The Wait State Control signals, WSC (0-1), can be 
generated with the address decode and the Read/ 
Write control signals as shown in Figure 6-3. 


[Figure 6-3. WSC (0-1) Generation. The address decode and W/R#
signals are combined to produce WSC(0-1).]


Note that during HALT and SHUTDOWN, the num- 
ber of wait states will depend on the WSC (0-1) 
inputs, which will select the memory half of one of 
the Wait State Registers (see CPU Reset and Shut- 
down Detect). 


The timing diagram of two typical pipelined cycles 
with 82370 generated wait states is shown in Figure 
6-4. Again, in this diagram, it is assumed that the 
82370 internal registers are not addressed. As de- 
fined in the timing of the 80376 processor, the Ad- 
dress (A1-23), 
Byte Enable (BHE#, BLE#), and 
other control signals (M/IO#, 
ADS#) are asserted 
one T-state earlier than in a non-pipelined cycle; i.e. 
they are asserted at T2P. Similar to the non-pipe- 
lined case, the Wait State Control (WSC) inputs are 
sampled in the middle of the state after the last state 
the ADS# signal is asserted. Therefore, the WSC 
inputs should be asserted during the T1P state of 
each pipelined cycle (which is one T-state earlier 
than in the non-pipelined cycle). 


[Figure 6-4. Timing diagram of pipelined cycles (T1P, T2, T2P, T1P)
with one 82370 generated wait state. Signals shown: CLK2, CLK,
A(1-23), M/IO#, BLE#, BHE#, WSC(0-1), ADS#, READY#, READYO#.]


The number of wait states generated in a pipelined 
cycle is selected in a similar manner as in the non- 
pipelined case discussed in the previous section. 
The only difference here is that the actual number of 
wait states generated will be one less than that of 
the non-pipelined cycle. This is done automatically 
by the Wait State Generator. 


6.3.3 EXTENDING AND EARLY TERMINATING BUS CYCLE


The 82370 allows external logic to either add wait 
states or cause early termination of a bus cycle by 
controlling the READY# input to the 82370 and the 
host processor. A possible configuration is shown in 
Figure 6-5. 


The EXT. RDY# (External Ready) signal of Figure 6-5 allows external
devices to cause early termination of a bus cycle. When this signal
is asserted LOW, the output of the circuit will also go LOW (even
though the READYO# of the 82370 may still be HIGH). This output is
fed to the READY# input of the 80376 and the 82370 to indicate the
completion of the current bus cycle.


Similarly, the EXT. NOT READY (External Not Ready) signal is used to
delay the READY# input of the processor and the 82370. As long as
this signal is driven HIGH, the output of the circuit will drive the
READY# input HIGH. This will effectively extend the duration of a
bus cycle. However, it is important to note that if the two-level
logic is not fast enough to satisfy the READY# setup time, the OR
gate should be eliminated. Instead, the 82370 Wait State Generator
can be disabled by driving both WSC (0-1) HIGH. In this case, the
addressed memory or I/O device should activate the external READY#
input whenever it is ready to terminate the current bus cycle.


Figures 6-6 and 6-7 show the timing relationships of 
the ready signals for the early termination and exten- 
sion of the bus cycles. Section 6-7, Application Is- 
sues, contains a detailed timing analysis of the ex- 
ternal circuit. 


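[Figures 6-6 and 6-7. Timing diagrams for early termination and
extension of bus cycles. Signals shown include CLK, A(1-23), M/IO#,
BLE#, BHE# and ADS#.]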


Due to the following implications, it should be noted that early
termination of bus cycles in which 82370 internal registers are
accessed is not recommended.


1. Erroneous data may be read from or written into the addressed
register.


2. The 82370 must be allowed to recover either before HLDA (Hold
Acknowledge) is asserted or before another bus cycle into an 82370
internal register is initiated.


The recovery time, in clock periods, equals the remaining wait
states that were avoided plus 4.


Altogether, there are four 8-bit internal registers associated with
the Wait State Generator. The port address map of these registers is
shown below in Table 6-2. A detailed description of each follows.


Table 6-2. Register Address Map


Port Address    Description

72H             Wait State Reg 0 (read/write)
73H             Wait State Reg 1 (read/write)
74H             Wait State Reg 2 (read/write)
75H             Ref. Wait State Reg (read/write)


These three 8-bit read/write registers are functionally identical.
They are used to store the pre-programmed wait state count. One half
of each register contains the wait state count for I/O accesses
while the other half contains the count for memory accesses. The
total number of wait states generated will depend on the type of bus
cycle. For a non-pipelined cycle, the actual number of wait states
requested is equal to the wait state count plus 1. For a pipelined
cycle, the number of wait states will be equal to the wait state
count in the selected register. Therefore, the Wait State Generator
is capable of generating 1 to 16 wait states in non-pipelined mode,
and 0 to 15 wait states in pipelined mode.


Note that the minimum wait state count in each register is 0. This
is equivalent to 0 wait states for a pipelined cycle and 1 wait
state for a non-pipelined cycle.


Similar to the Wait State Registers discussed above, this 4-bit
register is used to store the number of wait states to be generated
during a DRAM refresh cycle. Note that the Refresh Wait State
Register is not selected by the WSC inputs. It will automatically be
chosen whenever a DRAM refresh cycle occurs. If the Wait State
Generator is disabled during the refresh cycle (WSC (0-1) = 11),
READYO# will stay inactive and the Refresh Wait State Register is
ignored.


Using the Wait State Generator is relatively straightforward. No
special programming sequence is required. In order to ensure the
expected number of wait states will be generated when a register is
selected, the registers to be used must be programmed after power-up
by writing the appropriate wait state count into each register. Note
that upon hardware reset, all Wait State Registers are initialized
with the value FFH, giving the maximum number of wait states
possible. Also, each register can be read to check the wait state
count previously stored in the register.
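

As a hedged illustration of the programming step, the helpers below
compute the byte to write into a Wait State Register (ports 72H to
74H). The names wait_reg_pack and count_for_nonpipelined are
hypothetical; the layout (memory count in the most significant half,
I/O count in the least significant half) follows the description of
the Wait State Control inputs.

```c
#include <stdint.h>

/* Pack a memory wait state count and an I/O wait state count
 * (0-15 each) into one Wait State Register value.
 * Returns -1 if either count does not fit in four bits. */
static int wait_reg_pack(unsigned mem_count, unsigned io_count)
{
    if (mem_count > 15u || io_count > 15u)
        return -1;
    return (int)((mem_count << 4) | io_count);
}

/* Count to program so that a non-pipelined cycle receives the given
 * number of wait states (1-16). A pipelined cycle using the same
 * register then receives one wait state fewer.
 * Returns -1 if the request cannot be met. */
static int count_for_nonpipelined(unsigned wait_states)
{
    if (wait_states < 1u || wait_states > 16u)
        return -1;
    return (int)(wait_states - 1u);
}
```

Packing counts of 15 and 15 reproduces the reset value FFH, which
corresponds to 16 wait states in non-pipelined cycles.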


6.6 
Register Bit Definition 


WAIT STATE REGISTER 0, 1, 2


Port Address    Description

72H             Wait State Register 0 (read/write)
73H             Wait State Register 1 (read/write)
74H             Wait State Register 2 (read/write)


[Register format: D7-D4 = MEMORY WAIT STATE COUNT, D3-D0 = I/O WAIT
STATE COUNT.]


As mentioned in section 6.3.3, wait state cycles gen- 
erated by the 82370 can be terminated early or ex- 
tended longer by means of additional external logic 
(see Figure 6-5). In order to 
ensure that 
the 
READY# input timing requirement of the 80376 and 
the 82370 is satisfied, special care must be taken 
when designing this external control logic. This sec- 
tion addresses the design requirements. 


A simplified block diagram of the external logic along with the
READY# timing diagram is shown in Figure 6-8. The purpose is to
determine the maximum delay time allowed in the external control
logic in order to satisfy the READY# setup time.


First, it will be assumed that the 80376 is running at 16 MHz (i.e.
CLK2 is 32 MHz). Therefore, one bus state (two CLK2 periods) will be
equivalent to 62.5 ns. According to the AC specifications of the
82370 the maximum delay time for valid READYO# signal is 31 ns after
the rising edge of CLK2 in the beginning of T2 (for non-pipelined
cycle) or T2P (for pipelined cycle). Also, the minimum READY# setup
time of the 80376 and the 82370 should be 19 ns before the rising
edge of CLK2 at the beginning of the next bus state. This limits the
total delay time for the external READY# control logic to be 12.5 ns
(62.5 - 31 - 19) in order to meet the READY# setup timing
requirement.
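

The delay budget above is simple arithmetic and can be recomputed
for other clock rates or AC specifications with a small C helper.
The function name ready_logic_budget_ns is hypothetical, and the
delay and setup figures passed in should always be taken from the
applicable data sheet.

```c
/* Maximum delay (in ns) allowed through the external READY# control
 * logic: one bus state minus the READYO# valid delay minus the
 * READY# setup time.
 *   clk2_mhz         CLK2 frequency in MHz (twice the CPU clock)
 *   readyo_valid_ns  maximum READYO# valid delay
 *   ready_setup_ns   minimum READY# setup time */
static double ready_logic_budget_ns(double clk2_mhz,
                                    double readyo_valid_ns,
                                    double ready_setup_ns)
{
    /* One bus state is two CLK2 periods. */
    double bus_state_ns = 2.0 * 1000.0 / clk2_mhz;
    return bus_state_ns - readyo_valid_ns - ready_setup_ns;
}
```

With CLK2 at 32 MHz, a 31 ns READYO# delay and a 19 ns setup time,
the result is 12.5 ns, matching the figure derived in the text. A
negative result would mean the external logic cannot meet the setup
time at all.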


A = PHI1 + PHI2 = 62.5 ns
B = Maximum READYO# Valid Delay = 35 ns
C = READY# Setup Time = 20 ns
D = Maximum Ready Control Logic Delay = A - B - C = 7.5 ns


Figure 6-8. READY# Timing Consideration


7.1 
Functional Description


The 82370 DRAM Refresh Controller consists of a 
24-bit Refresh Address Counter and Refresh Re- 
quest logic for DRAM refresh operations (see Figure 
7-1). TIMER 1 can be used as a trigger signal to the 
DRAM Refresh Request logic. The Refresh Bus Size 
can be programmed to be 8- or 16-bit wide. Depend- 
ing on the Refresh Bus Size, the Refresh Address 
Counter will be incremented with the appropriate val- 
ue after every refresh cycle. The internal logic of the 
82370 will give the Refresh operation the highest 
priority in the bus control arbitration process. Bus 
control is not released and re-requested if the 82370 
is already a bus master. 


The dual function output pin of TIMER 1 (TOUT1/REF#) can be
programmed to generate the DRAM Refresh signal. If this feature is
enabled, the rising edge of TIMER 1 output (TOUT1#) will trigger the
DRAM Refresh Request logic. After some delay for gaining access of
the bus, the 82370 DRAM Controller will generate a DRAM Refresh
signal by driving REF# output LOW. This signal is cleared after the
refresh cycle has taken place, or by a hardware reset.


If the DRAM Refresh feature is disabled, the TOUT1/REF# output pin
is simply the TIMER 1 output. Detailed information of how TIMER 1
operates is discussed in section 6-Programmable Interval Timer, and
will not be repeated here.


In order to ensure data integrity of the DRAMs, the 
82370 gives the DRAM Refresh signal the highest 
priority in the arbitration logic. It allows DRAM Re- 
fresh to interrupt DMA in progress in order to per- 
form the DRAM Refresh cycle. The DMA service will 
be resumed after the refresh is done. 


In case of a DRAM Refresh during a DMA process, 
the cascaded device will be requested to get off the 
bus. This is done by de-asserting the EDACK signal. 
Once DREQn goes inactive, the 82370 will perform 
the refresh operation. Note that the DMA controller 
does not completely relinquish the system bus dur- 
ing refresh. The Refresh Generator simply "steals" 
a bus cycle between DMA accesses. 


Figure 7-2 shows the timing diagram of a Refresh Cycle. Upon
expiration of TIMER 1, the 82370 will try to take control of the
system bus by asserting HOLD. As soon as the 82370 sees HLDA go
active, the DRAM Refresh Cycle will be carried out by activating the
REF# signal as well as the address and control signals on the system
bus (note that REF# will not be active until two CLK periods after
HLDA is asserted). The address bus will contain the 24-bit address
currently in the Refresh Address Counter. The control signals are
driven the same way as in a Memory Read cycle. This "read" operation
is complete when the READY# signal is driven LOW.


Then, the 82370 will relinquish the bus by de-assert- 
ing HOLD. Typically, a Refresh Cycle without wait 
states will take five bus states to execute. If "n" wait 
states are added, the Refresh Cycle will last for five 
plus "n" bus states. 


How often the Refresh Generator will initiate a refresh cycle
depends on the frequency of CLKIN as well as TIMER 1's programmed
mode of operation. For this specific application, TIMER 1 should be
programmed to operate in Mode 2 to generate a constant clock rate.
See section 6-Programmable Interval Timer for more information on
programming the timer. One DRAM Refresh Cycle will be generated each
time TIMER 1 expires (when TOUT1 changes from LOW to HIGH).
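

The relationship between CLKIN and the refresh interval can be
sketched as below. This assumes that TIMER 1 in Mode 2 expires once
every N CLKIN periods for an initial count of N, as in 8254-style
rate generators, and that the maximum initial count listed as 0
stands for 65536 in binary counting. The function names and the
example frequencies are hypothetical.

```c
#include <stdint.h>

/* Initial count for TIMER 1 (Mode 2) so that refresh requests occur
 * at least as often as every interval_ns nanoseconds.
 * ASSUMPTION: one expiration per `count` CLKIN periods.
 * The count is rounded down so the interval is never exceeded.
 * Returns -1 if the count falls outside the Mode 2 range 2..65536. */
static int32_t timer1_refresh_count(uint32_t clkin_hz, uint32_t interval_ns)
{
    uint64_t count = ((uint64_t)clkin_hz * interval_ns) / 1000000000u;
    if (count < 2u || count > 65536u)
        return -1;
    return (int32_t)count;
}

/* 16-bit value actually written to the counter: a count of 65536 is
 * written as 0 (ASSUMPTION, see lead-in). */
static uint16_t timer1_count_encoding(int32_t count)
{
    return (uint16_t)((uint32_t)count & 0xFFFFu);
}
```

As an example with made-up numbers, a CLKIN of 8 MHz and a target
interval of 15,625 ns would call for an initial count of 125.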


The Wait State Generator can be used to insert wait 
states during a refresh cycle. The 82370 will auto- 
matically insert the desired number of wait states as 
programmed in the Refresh Wait State Register (see 
Wait State Generator). 


[Figure 7-2. Refresh Cycle timing diagram. Signals shown include
HLDA, A(1-23), M/IO#, BLE#, D/C#, W/R#, BHE# and TOUT1.]


7.4.1 WORD SIZE AND REFRESH ADDRESS COUNTER


The 82370 supports 8- and 16-bit refresh cycles. The bus width
during a refresh cycle is programmable (see Programming). The bus
size can be programmed via the Refresh Control Register (see
Register Overview). If the DRAM bus size is 8 or 16 bits, the
Refresh Address Counter will be incremented by 1 or 2, respectively.


The Refresh Address Counter is cleared by a hard- 
ware reset. 


7.5 
Register Set Overview 


The Refresh Generator has two internal registers to 
control its operation. They are the Refresh Control 
Register and the Refresh Wait State Register. Their 
port address map is shown in Table 7-1 below. 


Port Address    Description

1CH             Refresh Control Reg. (read/write)
75H             Ref. Wait State Reg. (read/write)


The Refresh Wait State Register is not part of the 
Refresh Generator. It is only used to program the 
number of wait states to be inserted during a refresh 
cycle. This register is discussed in detail in section 
7 (Wait State Generator) and will not be repeated 
here. 


This 2-bit register serves two functions. First, it is 
used to enable/disable the DRAM Refresh function 
output. If disabled, the output of TIMER 1 is simply 
used as a general purpose timer. The second func- 
tion of this register is to program the DRAM bus size 
for the refresh operation. The programmed bus size 
also determines how the Refresh Address Counter 
will be incremented after each refresh operation. 


7.6 
Programming 


Upon hardware reset, the DRAM Refresh function is 
disabled (the Refresh Control Register is cleared). 
The following programming steps are needed before 
the Refresh Generator can be used. Since the rate 
of refresh cycles depends on how TIMER 1 is pro- 
grammed, this timer must be initialized with the de- 
sired mode of operation as well as the correct 
refresh interval (see Programming Interval Timer). 
Whether or not wait states are to be generated dur- 
ing a refresh cycle, the Refresh Wait State Register 
must also be programmed with the appropriate val- 
ue. Then, the DRAM Refresh feature must be en- 
abled and the DRAM bus width should be defined. 
These can be done in one step by writing the appropriate control
word into the Refresh Control Register 
(see Register Bit Definition). After these steps are 
done, the refresh operation will automatically be in- 
voked by the Refresh Generator upon expiration of 
Timer 1. 
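

A small C model of the Refresh Control Register settings and the
Refresh Address Counter may help when checking a configuration. The
two-bit codes are taken from the Register Bit Definition (00 refresh
disabled, 10 bus size 16, 11 bus size 8); placing them in the low
two bits of the byte written to port 1CH is an assumption, as is the
choice to leave the counter unchanged while refresh is disabled. All
names are hypothetical.

```c
#include <stdint.h>

/* Two-bit Refresh Control Register codes. */
enum refresh_ctl {
    REFRESH_DISABLED = 0x0,  /* 00: refresh disabled */
    REFRESH_BUS16    = 0x2,  /* 10: bus size = 16 */
    REFRESH_BUS8     = 0x3   /* 11: bus size = 8 */
};

/* Value of the 24-bit Refresh Address Counter after one refresh
 * cycle: incremented by 1 for an 8-bit bus, by 2 for a 16-bit bus. */
static uint32_t next_refresh_address(uint32_t addr, enum refresh_ctl ctl)
{
    uint32_t step = 0;

    if (ctl == REFRESH_BUS8)
        step = 1;
    else if (ctl == REFRESH_BUS16)
        step = 2;

    /* The counter is 24 bits wide, so wrap at 2^24. */
    return (addr + step) & 0x00FFFFFFu;
}
```

The overall order remains as described in the text: program TIMER 1,
program the Refresh Wait State Register, then write the control
value to enable refresh and set the bus size.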


In addition to the above programming steps, it 
should be noted that after reset, although the 
TOUT1/REF# pin becomes the Timer 1 output, the 
state of this pin is undefined. This is because the 
Timer module has not been initialized yet. Therefore, 
if this output is used as a DRAM Refresh signal, this 
pin should be disqualified by external logic until the 
Refresh function is enabled. One simple solution is 
to logically AND this output with HLDA, since HLDA 
should not be active after reset. 


7.7 
Register Bit Definition 


REFRESH CONTROL REGISTER


8.0
RELOCATION REGISTER AND ADDRESS DECODE


8.1
Relocation Register


All the integrated peripheral devices in the 82370
are controlled by a set of internal registers. These
registers span a total of 256 consecutive address
locations (although not all the 256 locations are
used). The 82370 provides a Relocation Register
which allows the user to map this set of internal
registers into either the memory or I/O address space.
The function of the Relocation Register is to define
the base address of the internal register set of the
82370 and whether the registers are memory-mapped
or I/O-mapped. The format of the Relocation Register
is depicted in Figure 8-1.


Refresh Control Register (2-bit field):
  00 - Refresh disabled
  01 - Intel reserved
  10 - Bus size = 16
  11 - Bus size = 8


Relocation Register, Bit 0:
  0 - I/O mapped
  1 - Memory mapped


Note that the Relocation Register is part of the
internal register set of the 82370. It has a port
address of 7FH. Therefore, any time the content of the
Relocation Register is changed, the physical location
of this register will also be moved. Upon reset of the
82370, the content of the Relocation Register will be
cleared. This implies that the 82370 will respond to
its I/O addresses in the range of 0000H to 00FFH.


As shown in the figure, Bit 0 of the Relocation Regis- 
ter determines whether the 82370 registers are to be 
memory-mapped or I/O mapped. When Bit 0 is set 
to '0', the 82370 will respond to I/O Addresses. Ad- 
dress signals BHE#, BLE#, A1-A7 will be used to 
select one of the internal registers to be accessed. 
Bit 1 to Bit 7 of the Relocation Register will corre- 
spond to A9 to A15 of the Address bus, respectively. 
Together with A8 implied to be '0', A15 to A8 will be 
fully decoded by the 82370. The following shows 
how the 82370 is mapped into the I/O address 
space. 


The 82370 will respond to the I/O address range from
0CE00H to 0CEFFH.


Therefore, this I/O mapping mechanism allows the 
82370 internal registers to be located on any even, 
contiguous, 256 byte boundary of the system I/O 
space. 


When Bit 0 of the Relocation Register is set to '1',
the 82370 will respond to memory addresses. Again,
Address signals BHE#, BLE#, A1-A7 will be used
to select one of the internal registers to be
accessed. Bit 1 to Bit 7 of the Relocation Register
will correspond to A17-A23, respectively. A16 is
assumed to be '0', and A8-A15 are ignored. Consider
the following example.


Example


The 82370 will respond to memory addresses in
the range of A6XX00H to A6XXFFH (where 'X' is
don't care).


This scheme implies that the internal registers can
be located in any even, contiguous, 2^16-byte page
of the memory space.
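The two mapping rules can be summarized in a short decoding sketch. The function name is hypothetical, and the register values used in the usage note (0CEH and 0A7H) are inferred from the address ranges quoted in the examples rather than stated in the text.

```python
def relocation_base(reloc):
    """Decode an 8-bit Relocation Register value.

    Returns (space, base): space is "io" or "memory", and base is the
    lowest address of the 256-location internal register window.
    """
    if not 0 <= reloc <= 0xFF:
        raise ValueError("Relocation Register is 8 bits wide")
    upper = reloc & 0xFE          # bits 7..1; bit 0 selects the address space
    if reloc & 0x01:
        # Memory-mapped: bits 7..1 -> A23..A17, A16 assumed '0'.
        # A15..A8 are ignored by the decoder, so they are left as 0 here.
        return "memory", upper << 16
    # I/O-mapped: bits 7..1 -> A15..A9, A8 implied '0'.
    return "io", upper << 8
```

With this sketch, a register value of 0CEH decodes to I/O base 0CE00H and a value of 0A7H decodes to memory base A60000H, matching the two address ranges given in the text. The reset value 00H decodes to I/O base 0000H.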


As mentioned previously, the 82370 internal registers
do not occupy the entire contiguous 256 address
locations. Some of the locations are 'unoccupied'.
The 82370 always decodes the lower 8 address
signals (BHE#, BLE#, A1-A7) to determine if
any one of its registers is being accessed. If the
address does not correspond to any of its registers, the
82370 will not respond. This allows external devices
to be located within the 'holes' in the 82370 address
space. Note that there are several unused addresses
reserved for future Intel peripheral devices.


8.3 Chip-Select (CHPSEL#) 


The Chip-Select signal (CHPSEL#) will go active
when the 82370 is addressed in a Slave bus
cycle (either read or write), or in an interrupt
acknowledge cycle in which the 82370 will drive the
Data Bus. For a given bus cycle, CHPSEL# becomes
active and valid in the first T2 (in a non-pipelined
cycle) or in T1P (in a pipelined cycle). It will
stay valid until the cycle is terminated by READY#
driven active. As CHPSEL# becomes valid well before
the 82370 drives the Data Bus, it can be used to
control the transceivers that connect the local CPU
bus to the system bus. The timing diagram of
CHPSEL# is shown in Figure 8-2.

[Timing diagram: 82370 not accessed (T1, T2); 82370
accessed with 2 wait states (T1, T2, T2, T2)]


9.0 CPU RESET AND SHUTDOWN DETECT


The 82370 will activate the CPURST signal to reset
the host processor when one of the following
conditions occurs:
- 82370 RESET is active;
- 82370 detects an 80376 Shutdown cycle (this
  feature can be disabled);
- CPURST software command is issued to 80376.


Whenever the CPURST signal is activated, the 
82370 will reset its own internal Slave-Bus state ma- 
chine. 


Following a hardware reset, the 82370 will assert its 
CPURST output to reset the host processor. This 
output will stay active for as long as the RESET input 
is active. During a hardware reset, the 82370 internal 
registers will be initialized as defined in the corre- 
sponding functional descriptions. 


9.2 Software Reset 


CPURST can be generated by writing the following
bit pattern into 82370 register location 64H.

  D7             D0
   1 1 1 1 X X X 0


The Write operation into this port is considered as
an 82370 access and the internal Wait State Generator
will automatically determine the required number
of wait states. The CPURST will be active following
the completion of the Write cycle to this port.
This signal will last for 62 CLK2 periods. The 82370
should not be accessed until the CPURST is
deactivated.


This internal port is Write-Only and the 82370 will
not respond to a Read operation to this location.
Also, during a software reset command, the 82370
will reset its Slave-Bus state machine. However, its
internal registers remain unchanged. This allows the
operating system to distinguish a 'warm' reset by
reading any 82370 internal register previously
programmed for a non-default value. The Diagnostic
registers can be used for this purpose (see Internal
Control and Diagnostic Ports).
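As an illustration of the software reset command, the sketch below checks whether a data byte matches the 1111XXX0 pattern and computes the length of the resulting CPURST pulse. The function and constant names are hypothetical; the pattern, the register location 64H, and the 62 CLK2 period pulse length come from the text above.

```python
CPU_RESET_PORT = 0x64        # register location given in the text
CPURST_CLK2_PERIODS = 62     # pulse length stated in the text


def is_cpu_reset_command(value):
    """True if an 8-bit value matches 1111XXX0 (X = don't care)."""
    # Bits 7..4 must be 1 and bit 0 must be 0; bits 3..1 are ignored.
    return (value & 0xF1) == 0xF0


def cpurst_pulse_ns(clk2_period_ns):
    """Length of the CPURST pulse in ns for a given CLK2 period in ns."""
    return CPURST_CLK2_PERIODS * clk2_period_ns
```

For example, F0H and FEH both match the pattern, while F1H does not. With a 31 ns CLK2 period, the pulse lasts 62 x 31 = 1922 ns, during which the 82370 should not be accessed.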


The 82370 is constantly monitoring the Bus Cycle
Definition signals (M/IO#, D/C#, W/R#) and is
able to detect when the 80376 is in a Shutdown bus
cycle. Upon detection of a processor shutdown, the
82370 will activate the CPURST output for 62 CLK2
periods to reset the host processor. This signal is
generated after the Shutdown cycle is terminated by
the READY# signal.




Although the 82370 Wait State Generator will not
automatically respond to a Shutdown (or Halt) cycle,
the Wait State Control inputs (WSC0, WSC1) can be
used to determine the number of wait states in the
same manner as other non-82370 bus cycles.


This Shutdown Detect feature can be enabled or dis- 
abled by writing a control bit in the Internal Control 
Port at address 61H (see Internal Control and Diag- 
nostic Ports). This feature is disabled upon a hard- 
ware reset of the 82370. As in the case of Software 
Reset, the 82370 will reset its Slave-Bus state ma- 
chine but will not change any of its internal register 
contents. 


10.0 INTERNAL CONTROL AND DIAGNOSTIC PORTS


The format of the Internal Control Port of the 82370
is shown in Figure 10-1. This Control Port is used to
enable/disable the Processor Shutdown Detect
mechanism as well as to control the Gate inputs of
Timers 2 and 3. Note that this is a Write-Only port.
Therefore, the 82370 will not respond to a read
operation to this port. Upon hardware reset, this port
will be cleared; i.e., the Shutdown Detect feature
and the Gate inputs of Timers 2 and 3 are disabled.


10.2 Diagnostic Ports


Two 8-bit read/write Diagnostic Ports are provided 
in the 82370. These are two storage registers and 
have no effect on the operation of the 82370. They 
can be used to store checkpoint data or error codes 
in the power-on sequence and in the diagnostic 
service routines. As mentioned in the CPU RESET 
AND SHUTDOWN DETECT section, these Diagnos- 
tic Ports can be used to distinguish between 'cold' 
and 'warm' reset. Upon hardware reset, both Diag- 
nostic Ports are cleared. The address map of these 
Diagnostic Ports is shown in Figure 10-2. 


Port                              Address

Diagnostic Port 1 (Read/Write)    80H
Diagnostic Port 2 (Read/Write)    88H
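The 'cold' versus 'warm' reset test described above can be sketched as a comparison against a marker value. This is only an illustration under stated assumptions: the signature value A5H and the function name are hypothetical, and the sketch relies solely on the behavior given in the text (both Diagnostic Ports are cleared by a hardware reset and left unchanged by a software reset or Shutdown Detect reset).

```python
DIAG_PORT_ADDRESSES = (0x80, 0x88)   # from the address map above
WARM_SIGNATURE = 0xA5                # hypothetical non-zero marker


def classify_reset(diag_value, signature=WARM_SIGNATURE):
    """Classify a reset from the value read back from a Diagnostic Port."""
    if signature == 0:
        # A zero marker could not be told apart from the hardware-reset value.
        raise ValueError("signature must be non-zero")
    return "warm" if diag_value == signature else "cold"
```

In use, start-up code would write the signature to a Diagnostic Port once initialization is complete; on the next reset, reading the port back and passing the value to `classify_reset` tells the two cases apart.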


There are nineteen I/O ports in the 82370 address
space which are reserved for Intel future peripheral
device use only. Their address locations are: 10H,
12H, 14H, 16H, 2AH, 3DH, 3EH, 45H, 46H, 76H,
77H, 7DH, 7EH, CCH, CDH, D0H, D2H, D4H, and
D6H. These addresses should not be used in the
system since the 82370 will respond to read/write
operations to these locations and bus contention
may occur if any peripheral is assigned to the same
address location.
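When assigning addresses to external peripherals, a simple lookup against the nineteen reserved locations can catch conflicts early. The sketch below is illustrative; the names are hypothetical, and it assumes the check is made on the low 8 address bits, i.e. the offset within the 82370 register window.

```python
# The nineteen Intel-reserved port offsets listed in the text.
INTEL_RESERVED_PORTS = frozenset({
    0x10, 0x12, 0x14, 0x16, 0x2A, 0x3D, 0x3E, 0x45, 0x46, 0x76,
    0x77, 0x7D, 0x7E, 0xCC, 0xCD, 0xD0, 0xD2, 0xD4, 0xD6,
})


def is_reserved_offset(address):
    """True if the low 8 address bits select an Intel-reserved port."""
    return (address & 0xFF) in INTEL_RESERVED_PORTS
```

For instance, with the register window relocated to 0CE00H, address 0CE2AH falls on a reserved location, whereas 0CE7FH is the Relocation Register itself and is not in the reserved set.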


12.0 PACKAGE THERMAL SPECIFICATIONS


The Intel 82370 Integrated System Peripheral is
specified for operation when case temperature is
within the range of 0°C to 78°C for the ceramic
132-pin PGA package, and 68°C for the 100-pin
plastic package. The case temperature may be
measured in any environment, to determine whether the
82370 is within specified operating range. The case
temperature should be measured at the center of
the top surface opposite the pins.

The ambient temperature is guaranteed as long as
TC is not violated. The ambient temperature can be
calculated from the θJC and θJA from the following
equations:

$$T_J = T_C + P \cdot \theta_{JC}$$

$$T_A = T_J - P \cdot \theta_{JA}$$

$$T_C = T_A + P \cdot [\theta_{JA} - \theta_{JC}]$$

Values for θJA and θJC are given in Table 12.1 for the
100-lead fine pitch. θJA is given at various airflows.
Table 12.2 shows the maximum TA allowable (without
exceeding TC) at various airflows. Note that TA
can be improved further by attaching "fins" or a
"heat sink" to the package. P is calculated using the
maximum hot ICC.

Table 12.1. 82370 Package Thermal Characteristics
Thermal Resistances (°C/Watt) θJC and θJA

                         θJA Versus Airflow - ft³/min (m³/sec)
Package          θJC      0     200     400     600     800    1000
                         (0)  (1.01)  (2.03)  (3.04)  (4.06)  (5.07)

100L Fine Pitch    7      33     27      24      21      18      17
132L PGA           2      21     17      14      12      11      10


Table 12.2. 82370 Maximum Allowable Ambient
Temperature at Various Airflows

                         TA (°C) Versus Airflow - ft³/min (m³/sec)
Package          θJC      0     200     400     600     800    1000
                         (0)  (1.01)  (2.03)  (3.04)  (4.06)  (5.07)

100L Fine Pitch    7      63     74      79      85      91      92
132L PGA           2      74     83      88      93      97      99


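100L PQFP Pkg:

$$T_C = T_A + P \cdot (\theta_{JA} - \theta_{JC})$$
$$T_C = 63 + 220\ \text{mA} \cdot (33 - 7)$$
$$T_C = 63 + 220\ \text{mA} \cdot (26)$$
$$T_C = 63 + 5.72$$
$$T_C = 68.7$$


132L PGA Pkg:

$$T_C = T_A + P \cdot (\theta_{JA} - \theta_{JC})$$
$$T_C = 74 + 220\ \text{mA} \cdot (21 - 2)$$
$$T_C = 74 + 220\ \text{mA} \cdot (19)$$
$$T_C = 74 + 4.18$$
$$T_C = 78.2$$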
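The two worked examples can be reproduced with a small helper. This is a sketch only: the function names are hypothetical, and the value 0.22 passed for P simply mirrors the "220 mA" figure that the worked examples substitute for P; the examples do not state how that figure is converted to a power, and no conversion is attempted here.

```python
def case_temperature(t_ambient, power, theta_ja, theta_jc):
    """TC = TA + P * (thetaJA - thetaJC)."""
    return t_ambient + power * (theta_ja - theta_jc)


def max_ambient(t_case_max, power, theta_ja, theta_jc):
    """Largest TA that keeps TC at or below t_case_max (same formula, solved for TA)."""
    return t_case_max - power * (theta_ja - theta_jc)
```

Calling `case_temperature(63, 0.22, 33, 7)` gives approximately 68.72 for the 100-lead package at zero airflow, and `case_temperature(74, 0.22, 21, 2)` gives approximately 78.18 for the PGA package, matching the rounded results 68.7 and 78.2 shown above.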




82370 D.C. Specifications
Functional Operating Range: VCC = 5.0V ±10%; TCASE = 0°C to 78°C for 132-pin
PGA, 0°C to 68°C for 100-pin plastic

Symbol  Parameter Description                   Min        Max        Units  Notes

VIL     Input Low Voltage                       -0.3       0.8        V      (Note 1)
VIH     Input High Voltage                      2.0        VCC + 0.3  V
VILC    CLK2 Input Low Voltage                  -0.3       0.8        V      (Note 1)
VIHC    CLK2 Input High Voltage                 VCC - 0.8  VCC + 0.3  V

VOL     Output Low Voltage
          IOL = 4 mA: A1-A23, D0-D15,                      0.45       V
            BHE#, BLE#
          IOL = 5 mA: All Others                           0.45       V

VOH     Output High Voltage
          IOH = -1 mA: A23-A1, D15-D0,          2.4                   V      (Note 5)
            BHE#, BLE#
          IOH = -0.2 mA: A23-A1, D15-D0,        VCC - 0.5             V      (Note 5)
            BHE#, BLE#
          IOH = -0.9 mA: All Others             2.4                   V      (Note 5)
          IOH = -0.18 mA: All Others            VCC - 0.5             V      (Note 5)

ILI     Input Leakage Current                              ±15        µA
          All Inputs Except: IRQ11#-IRQ23#,
          EOP#, TOUT2/IRQ3#, DREQ4/IRQ9#
ILI1    Input Leakage Current                   10         -300       µA     0 < VIN < VCC
          Inputs: IRQ11#-IRQ23#, EOP#,                                       (Note 3)
          TOUT2/IRQ3, DREQ4/IRQ9
ILO     Output Leakage Current                             ±15        µA     0 < VIN < VCC
ICC     Supply Current (CLK2 = 32 MHz)                     220        mA     (Note 4)
CI      Input Capacitance                                  12         pF     (Note 2)
CCLK    CLK2 Input Capacitance                             20         pF     (Note 2)


NOTES:
1. Minimum value is not 100% tested.
2. fC = 1 MHz; sampled only.
3. These pins have weak internal pullups. They should not be left floating.
4. ICC is specified with inputs driven to CMOS levels, and outputs driving CMOS loads. ICC may be higher if inputs are driven
to TTL levels, or if outputs are driving TTL loads.
5. Tested at the minimum operating frequency of the part.


[Figure: CLK2 waveform with 2V reference level]


LEGEND:
A - Maximum output delay specification
B - Minimum output delay specification
C - Minimum input setup specification
D - Minimum input hold specification


82370 A.C. Specifications
These A.C. timings are tested at 1.5V thresholds, except as noted.

Functional Operating Range: VCC = 5.0V ±10%; TCASE = 0°C to 78°C for 132-pin PGA,
0°C to 68°C for 100-pin plastic

Symbol  Parameter Description                      Min   Max   Units  Notes

        Operating Frequency                          4    16   MHz    1/(t1a x 2)
t1      CLK2 Period                                 31   125   ns
t2a     CLK2 High Time                               9         ns     At 2.0V
t2b     CLK2 High Time                               5         ns     At VCC - 0.8V
t3a     CLK2 Low Time                                9         ns     At 2.0V
t3b     CLK2 Low Time                                7         ns     At 0.8V
t4      CLK2 Fall Time                                     7   ns     VCC - 0.8V to 0.8V
t5      CLK2 Rise Time                                     7   ns     0.8V to VCC - 0.8V
t6      A1-A23, BHE#, BLE#,                          4    36   ns     CL = 120 pF
        EDACK0-EDACK2 Valid Delay
t7      A1-A23, BHE#, BLE#,                          4    40   ns     (Note 1)
        EDACK0-EDACK3 Float Delay
t8      A1-A23, BHE#, BLE# Setup Time                6         ns
t9      A1-A23, BHE#, BLE# Hold Time                 4         ns
t10     W/R#, M/IO#, D/C# Valid Delay                4    33   ns     CL = 75 pF
t11     W/R#, M/IO#, D/C# Float Delay                4    35   ns     (Note 1)




Symbol  Parameter Description                      Min   Max   Units  Notes

t12     W/R#, M/IO#, D/C# Setup Time                 6         ns
t13     W/R#, M/IO#, D/C# Hold Time                  4         ns
t14     ADS# Valid Delay                             6    33   ns     CL = 50 pF
t15     ADS# Float Delay                             4    35   ns     (Note 1)
t16     ADS# Setup Time                             21         ns
t17     ADS# Hold Time                               4         ns
t18     Slave Mode D0-D15 Read Valid                 3    46   ns     CL = 120 pF
t19     Slave Mode D0-D15 Read Float                 6    35   ns     (Note 1)
t20     Slave Mode D0-D15 Write Setup               31         ns
t21     Slave Mode D0-D15 Write Hold                26         ns
t22     Master Mode D0-D15 Write Valid               4    40   ns     CL = 120 pF
t23     Master Mode D0-D15 Write Float               4    35   ns     (Note 1)
t24     Master Mode D0-D15 Read Setup                8         ns
t25     Master Mode D0-D15 Read Hold                 6         ns
t26     READY# Setup Time                           19         ns
t27     READY# Hold Time                             4         ns
t28     WSC0-WSC1 Setup Time                         6         ns
t29     WSC0-WSC1 Hold Time                         21         ns
t30     RESET Setup Time                            13         ns
t31     RESET Hold Time                              4         ns
t32     READYO# Valid Delay                          4    31   ns     CL = 25 pF
t33     CPURST Valid Delay (Falling Edge Only)       2    18   ns     CL = 50 pF
t34     HOLD Valid Delay                             5    33   ns     CL = 100 pF
t35     HLDA Setup Time                             21         ns
t36     HLDA Hold Time                               6         ns
t37a    EOP# Setup (Synchronous)                    21         ns
t38a    EOP# Hold (Synchronous)                      6         ns
t37b    EOP# Setup (Asynchronous)                   11         ns
t38b    EOP# Hold (Asynchronous)                    11         ns
t39     EOP# Valid Delay (Falling Edge Only)         5    38   ns     CL = 100 pF
t40     EOP# Float Delay                             5    40   ns     (Note 1)
t41a    DREQ Setup (Synchronous)                    21         ns
t42a    DREQ Hold (Synchronous)                      4         ns
t41b    DREQ Setup (Asynchronous)                   11         ns
t42b    DREQ Hold (Asynchronous)                    11         ns
t43     INT Valid Delay from IRQn                        500   ns
t44     NA# Setup Time                               5         ns
t45     NA# Hold Time                               15         ns




Symbol  Parameter Description                      Min   Max   Units  Notes

t46     CLKIN Frequency                             DC    10   MHz
t47     CLKIN High Time                             30         ns     2.0V
t48     CLKIN Low Time                              50         ns     0.8V
t49     CLKIN Rise Time                                   10   ns     0.8V to 3.7V
t50     CLKIN Fall Time                                   10   ns     3.7V to 0.8V
        TOUT1#/REF# Valid Delay
t51       from CLK2 (Refresh)                        4    36   ns     CL = 120 pF
t52       from CLKIN (Timer)                         3    93   ns     CL = 120 pF
t53     TOUT2# Valid Delay                           3    93   ns     CL = 120 pF
          (from CLKIN, Falling Edge Only)
t54     TOUT2# Float Delay                           3    36   ns     (Note 1)
t55     TOUT3# Valid Delay (from CLKIN)              3    93   ns     CL = 120 pF
t56     CHPSEL# Valid Delay                          1    35   ns     CL = 25 pF


NOTE:
1. Float condition occurs when the maximum output current becomes less than ILO in magnitude. Float delay is not tested.

For testing purposes, the float condition occurs when the dynamic output driven voltage changes with current loads.

This 82370 data sheet, version -002, contains updates and improvements to previous versions. A revision
summary is listed here for your convenience.


The sections significantly revised since version -001 are:

- Section 12.0 Electrical Characteristics renumbered Section 13.0.
- Section 12.0 Package Thermal Specifications added.
- Section 13.0 Electrical Specifications updated TCASE, VOH, ICC, T33, T39, Figure 13.6.
- Appendix C, Programming the 82370 Interrupt Controllers, added.
- Appendix D, System Notes, added.
- Section 14.0 Revision History added.

APPENDIX A
PORTS LISTED BY ADDRESS


Port Address
(HEX)         Description

00            Read/Write DMA Channel 0 Target Address, A0-A15
01            Read/Write DMA Channel 0 Byte Count, B0-B15
02            Read/Write DMA Channel 1 Target Address, A0-A15
03            Read/Write DMA Channel 1 Byte Count, B0-B15
04            Read/Write DMA Channel 2 Target Address, A0-A15
05            Read/Write DMA Channel 2 Byte Count, B0-B15
06            Read/Write DMA Channel 3 Target Address, A0-A15
07            Read/Write DMA Channel 3 Byte Count, B0-B15
08            Read/Write DMA Channel 0-3 Status/Command I Register
09            Read/Write DMA Channel 0-3 Software Request Register
0A            Write DMA Channel 0-3 Set-Reset Mask Register
0B            Write DMA Channel 0-3 Mode Register I
0C            Write Clear Byte-Pointer FF
0D            Write DMA Master-Clear
0E            Write DMA Channel 0-3 Clear Mask Register
0F            Read/Write DMA Channel 0-3 Mask Register
10            Intel Reserved
11            Read/Write DMA Channel 0 Byte Count, B16-B23
12            Intel Reserved
13            Read/Write DMA Channel 1 Byte Count, B16-B23
14            Intel Reserved
15            Read/Write DMA Channel 2 Byte Count, B16-B23
16            Intel Reserved
17            Read/Write DMA Channel 3 Byte Count, B16-B23
18            Write DMA Channel 0-3 Bus Size Register
19            Read/Write DMA Channel 0-3 Chaining Register
1A            Write DMA Channel 0-3 Command Register II
1B            Write DMA Channel 0-3 Mode Register II
1C            Read/Write Refresh Control Register
1E            Reset Software Request Interrupt
20            Write Bank B ICW1, OCW2 or OCW3
              Read Bank B Poll, Interrupt Request or In-Service Status Register
21            Write Bank B ICW2, ICW3, ICW4 or OCW1
              Read Bank B Interrupt Mask Register
22            Read Bank B ICW2
28            Read/Write IRQ8 Vector Register
29            Read/Write IRQ9 Vector Register
2A            Reserved


2B            Read/Write IRQ11 Vector Register
2C            Read/Write IRQ12 Vector Register
2D            Read/Write IRQ13 Vector Register
2E            Read/Write IRQ14 Vector Register
2F            Read/Write IRQ15 Vector Register
30            Write Bank A ICW1, OCW2 or OCW3
              Read Bank A Poll, Interrupt Request or In-Service Status Register
31            Write Bank A ICW2, ICW3, ICW4 or OCW1
              Read Bank A Interrupt Mask Register
32            Read Bank A ICW2
38            Read/Write IRQ0 Vector Register
39            Read/Write IRQ1 Vector Register
3A            Read/Write IRQ1.5 Vector Register
3B            Read/Write IRQ3 Vector Register
3C            Read/Write IRQ4 Vector Register
3D            Reserved
3E            Reserved
3F            Read/Write IRQ7 Vector Register
40            Read/Write Counter 0 Register
41            Read/Write Counter 1 Register
42            Read/Write Counter 2 Register
43            Write Control Word Register I - Counter 0, 1, 2
44            Read/Write Counter 3 Register
45            Reserved
46            Reserved
47            Write Word Register II - Counter 3
61            Write Internal Control Port
64            Write CPU Reset Register (Data - 1111XXX0H)
72            Read/Write Wait State Register 0
73            Read/Write Wait State Register 1
74            Read/Write Wait State Register 2
75            Read/Write Refresh Wait State Register
76            Reserved
77            Reserved
7D            Reserved
7E            Reserved
7F            Read/Write Relocation Register
80            Read/Write Internal Diagnostic Port 0
81            Read/Write DMA Channel 2 Target Address, A16-A23
82            Read/Write DMA Channel 3 Target Address, A16-A23
83            Read/Write DMA Channel 1 Target Address, A16-A23
87            Read/Write DMA Channel 0 Target Address, A16-A23
88            Read/Write Internal Diagnostic Port 1
89            Read/Write DMA Channel 6 Target Address, A16-A23
8A            Read/Write DMA Channel 7 Target Address, A16-A23
8B            Read/Write DMA Channel 5 Target Address, A16-A23
8F            Read/Write DMA Channel 4 Target Address, A16-A23


90            Read/Write DMA Channel 0 Requester Address, A0-A15
91            Read/Write DMA Channel 0 Requester Address, A16-A23
92            Read/Write DMA Channel 1 Requester Address, A0-A15
93            Read/Write DMA Channel 1 Requester Address, A16-A23
94            Read/Write DMA Channel 2 Requester Address, A0-A15
95            Read/Write DMA Channel 2 Requester Address, A16-A23
96            Read/Write DMA Channel 3 Requester Address, A0-A15
97            Read/Write DMA Channel 3 Requester Address, A16-A23
98            Read/Write DMA Channel 4 Requester Address, A0-A15
99            Read/Write DMA Channel 4 Requester Address, A16-A23
9A            Read/Write DMA Channel 5 Requester Address, A0-A15
9B            Read/Write DMA Channel 5 Requester Address, A16-A23
9C            Read/Write DMA Channel 6 Requester Address, A0-A15
9D            Read/Write DMA Channel 6 Requester Address, A16-A23
9E            Read/Write DMA Channel 7 Requester Address, A0-A15
9F            Read/Write DMA Channel 7 Requester Address, A16-A23
A0            Write Bank C ICW1, OCW2 or OCW3
              Read Bank C Poll, Interrupt Request or In-Service Status Register
A1            Write Bank C ICW2, ICW3, ICW4 or OCW1
              Read Bank C Interrupt Mask Register
A2            Read Bank C ICW2
A8            Read/Write IRQ16 Vector Register
A9            Read/Write IRQ17 Vector Register
AA            Read/Write IRQ18 Vector Register
AB            Read/Write IRQ19 Vector Register
AC            Read/Write IRQ20 Vector Register
AD            Read/Write IRQ21 Vector Register
AE            Read/Write IRQ22 Vector Register
AF            Read/Write IRQ23 Vector Register
C0            Read/Write DMA Channel 4 Target Address, A0-A15
C1            Read/Write DMA Channel 4 Byte Count, B0-B15
C2            Read/Write DMA Channel 5 Target Address, A0-A15
C3            Read/Write DMA Channel 5 Byte Count, B0-B15
C4            Read/Write DMA Channel 6 Target Address, A0-A15
C5            Read/Write DMA Channel 6 Byte Count, B0-B15
C6            Read/Write DMA Channel 7 Target Address, A0-A15
C7            Read/Write DMA Channel 7 Byte Count, B0-B15
C8            Read DMA Channel 4-7 Status/Command I Register
C9            Read/Write DMA Channel 4-7 Software Request Register
CA            Write DMA Channel 4-7 Set-Reset Mask Register
CB            Write DMA Channel 4-7 Mode Register I
CC            Reserved
CD            Reserved
CE            Write DMA Channel 4-7 Clear Mask Register
CF            Read/Write DMA Channel 4-7 Mask Register
D0            Intel Reserved
D1            Read/Write DMA Channel 4 Byte Count, B16-B23
D2            Intel Reserved
D3            Read/Write DMA Channel 5 Byte Count, B16-B23


D4            Intel Reserved
D5            Read/Write DMA Channel 6 Byte Count, B16-B23
D6            Intel Reserved
D7            Read/Write DMA Channel 7 Byte Count, B16-B23
D8            Write DMA Channel 4-7 Bus Size Register
D9            Read/Write DMA Channel 4-7 Chaining Register
DA            Write DMA Channel 4-7 Command Register II
DB            Write DMA Channel 4-7 Mode Register II


APPENDIX B
PORTS LISTED BY FUNCTION


Port Address
(HEX)         Description

DMA CONTROLLER

0D            Write DMA Master-Clear
0C            Write DMA Clear Byte-Pointer FF

08            Read/Write DMA Channel 0-3 Status/Command I Register
C8            Read/Write DMA Channel 4-7 Status/Command I Register

1A            Write DMA Channel 0-3 Command Register II
DA            Write DMA Channel 4-7 Command Register II

0B            Write DMA Channel 0-3 Mode Register I
CB            Write DMA Channel 4-7 Mode Register I

1B            Write DMA Channel 0-3 Mode Register II
DB            Write DMA Channel 4-7 Mode Register II

09            Read/Write DMA Channel 0-3 Software Request Register
C9            Read/Write DMA Channel 4-7 Software Request Register

1E            Reset Software Request Interrupt

0E            Write DMA Channel 0-3 Clear Mask Register
CE            Write DMA Channel 4-7 Clear Mask Register
0F            Read/Write DMA Channel 0-3 Mask Register
CF            Read/Write DMA Channel 4-7 Mask Register
0A            Write DMA Channel 0-3 Set-Reset Mask Register
CA            Write DMA Channel 4-7 Set-Reset Mask Register

18            Write DMA Channel 0-3 Bus Size Register
D8            Write DMA Channel 4-7 Bus Size Register

19            Read/Write DMA Channel 0-3 Chaining Register
D9            Read/Write DMA Channel 4-7 Chaining Register

00            Read/Write DMA Channel 0 Target Address, A0-A15
87            Read/Write DMA Channel 0 Target Address, A16-A23
01            Read/Write DMA Channel 0 Byte Count, B0-B15
11            Read/Write DMA Channel 0 Byte Count, B16-B23
90            Read/Write DMA Channel 0 Requester Address, A0-A15
91            Read/Write DMA Channel 0 Requester Address, A16-A23


DMA CONTROLLER (Continued)

02            Read/Write DMA Channel 1 Target Address, A0-A15
83            Read/Write DMA Channel 1 Target Address, A16-A23
03            Read/Write DMA Channel 1 Byte Count, B0-B15
13            Read/Write DMA Channel 1 Byte Count, B16-B23
92            Read/Write DMA Channel 1 Requester Address, A0-A15
93            Read/Write DMA Channel 1 Requester Address, A16-A23

04            Read/Write DMA Channel 2 Target Address, A0-A15
81            Read/Write DMA Channel 2 Target Address, A16-A23
05            Read/Write DMA Channel 2 Byte Count, B0-B15
15            Read/Write DMA Channel 2 Byte Count, B16-B23
94            Read/Write DMA Channel 2 Requester Address, A0-A15
95            Read/Write DMA Channel 2 Requester Address, A16-A23

06            Read/Write DMA Channel 3 Target Address, A0-A15
82            Read/Write DMA Channel 3 Target Address, A16-A23
07            Read/Write DMA Channel 3 Byte Count, B0-B15
17            Read/Write DMA Channel 3 Byte Count, B16-B23
96            Read/Write DMA Channel 3 Requester Address, A0-A15
97            Read/Write DMA Channel 3 Requester Address, A16-A23

C0            Read/Write DMA Channel 4 Target Address, A0-A15
8F            Read/Write DMA Channel 4 Target Address, A16-A23
C1            Read/Write DMA Channel 4 Byte Count, B0-B15
D1            Read/Write DMA Channel 4 Byte Count, B16-B23
98            Read/Write DMA Channel 4 Requester Address, A0-A15
99            Read/Write DMA Channel 4 Requester Address, A16-A23

C2            Read/Write DMA Channel 5 Target Address, A0-A15
8B            Read/Write DMA Channel 5 Target Address, A16-A23
C3            Read/Write DMA Channel 5 Byte Count, B0-B15
D3            Read/Write DMA Channel 5 Byte Count, B16-B23
9A            Read/Write DMA Channel 5 Requester Address, A0-A15
9B            Read/Write DMA Channel 5 Requester Address, A16-A23

C4            Read/Write DMA Channel 6 Target Address, A0-A15
89            Read/Write DMA Channel 6 Target Address, A16-A23
C5            Read/Write DMA Channel 6 Byte Count, B0-B15
D5            Read/Write DMA Channel 6 Byte Count, B16-B23
9C            Read/Write DMA Channel 6 Requester Address, A0-A15
9D            Read/Write DMA Channel 6 Requester Address, A16-A23

C6            Read/Write DMA Channel 7 Target Address, A0-A15
8A            Read/Write DMA Channel 7 Target Address, A16-A23
C7            Read/Write DMA Channel 7 Byte Count, B0-B15
D7            Read/Write DMA Channel 7 Byte Count, B16-B23
9E            Read/Write DMA Channel 7 Requester Address, A0-A15
9F            Read/Write DMA Channel 7 Requester Address, A16-A23


INTERRUPT CONTROLLER

20            Write Bank B ICW1, OCW2 or OCW3
              Read Bank B Poll, Interrupt Request or In-Service Status Register
21            Write Bank B ICW2, ICW3, ICW4 or OCW1
              Read Bank B Interrupt Mask Register
22            Read Bank B ICW2
28            Read/Write IRQ8 Vector Register
29            Read/Write IRQ9 Vector Register
2A            Reserved
2B            Read/Write IRQ11 Vector Register
2C            Read/Write IRQ12 Vector Register
2D            Read/Write IRQ13 Vector Register
2E            Read/Write IRQ14 Vector Register
2F            Read/Write IRQ15 Vector Register

A0            Write Bank C ICW1, OCW2 or OCW3
              Read Bank C Poll, Interrupt Request or In-Service Status Register
A1            Write Bank C ICW2, ICW3, ICW4 or OCW1
              Read Bank C Interrupt Mask Register
A2            Read Bank C ICW2
A8            Read/Write IRQ16 Vector Register
A9            Read/Write IRQ17 Vector Register
AA            Read/Write IRQ18 Vector Register
AB            Read/Write IRQ19 Vector Register
AC            Read/Write IRQ20 Vector Register
AD            Read/Write IRQ21 Vector Register
AE            Read/Write IRQ22 Vector Register
AF            Read/Write IRQ23 Vector Register

30            Write Bank A ICW1, OCW2 or OCW3
              Read Bank A Poll, Interrupt Request or In-Service Status Register
31            Write Bank A ICW2, ICW3, ICW4 or OCW1
              Read Bank A Interrupt Mask Register
32            Read Bank A ICW2
38            Read/Write IRQ0 Vector Register
39            Read/Write IRQ1 Vector Register
3A            Read/Write IRQ1.5 Vector Register
3B            Read/Write IRQ3 Vector Register
3C            Read/Write IRQ4 Vector Register
3D            Reserved
3E            Reserved
3F            Read/Write IRQ7 Vector Register


PROGRAMMABLE INTERVAL TIMER

40            Read/Write Counter 0 Register
41            Read/Write Counter 1 Register
42            Read/Write Counter 2 Register
43            Write Control Word Register I - Counter 0, 1, 2
44            Read/Write Counter 3 Register
47            Write Word Register II - Counter 3


CPU RESET

64            Write CPU Reset Register (Data - 1111XXX0H)


WAIT STATE GENERATOR

72            Read/Write Wait State Register 0
73            Read/Write Wait State Register 1
74            Read/Write Wait State Register 2
75            Read/Write Refresh Wait State Register


DRAM REFRESH CONTROLLER

1C            Read/Write Refresh Control Register


INTERNAL CONTROL AND DIAGNOSTIC PORTS

61            Write Internal Control Port
80            Read/Write Internal Diagnostic Port 0
88            Read/Write Internal Diagnostic Port 1


RELOCATION REGISTER

7F            Read/Write Relocation Register


INTEL RESERVED PORTS

10            Reserved
12            Reserved
14            Reserved
16            Reserved
2A            Reserved
3D            Reserved
3E            Reserved
45            Reserved
46            Reserved
76            Reserved
77            Reserved
7D            Reserved
7E            Reserved
CC            Reserved
CD            Reserved
D0            Reserved
D2            Reserved
D4            Reserved
D6            Reserved


APPENDIX C
PROGRAMMING THE 82370 INTERRUPT CONTROLLERS


This Appendix describes two methods of programming and initializing the Interrupt Controllers of the
82370. A simple interrupt service routine is also shown which provides compatibility with the 82C59
Interrupt Controller.


The two methods of programming the 82370 Interrupt Controllers are needed to provide simple
initialization procedures in different software environments. For new applications, a simple initialization
and programming sequence can be used. For PC-DOS or other applications which expect 8259s, an
interrupt handler for initialization traps must be provided. Once the handler is in place, all three 82370
Interrupt Controller banks can be programmed or initialized in the same manner as an 8259.


The ICW2 interrupt is generated by the 82370 when writing the ICW2 command to any of the interrupt
controller banks. This interrupt is supplied to provide compatibility to existing code that expects to be
programming 82C59s. The ICW2 value is stored in the ICW2 register of the associated bank, but is
ignored by the controller. It is the responsibility of the ICW2 interrupt handler to read the ICW2 register
and use its value to program the individual vector registers accordingly.


New applications do not generally require compatibility with previous code, or at least the code is usually
easily modifiable. If the application fits this description, then the ICW2 interrupt can be ignored. This is
done by initializing the interrupt controller as necessary, and before enabling CPU interrupts, removing
the ICW2 interrupt request by reading the ICW2 register. Listing 1 shows the code for doing this for
bank A. The same procedure can be used for the other banks.


Listing 1. Initialization of an 82370 Interrupt Controller Bank Without ICW2 Interrupts


        mov al,ICW1             ;initialize controller logic
        out 30h,al              ;begin sequence
        mov al,ICW2
        out 31h,al
        mov al,ICW3
        out 31h,al
        mov al,ICW4
        out 31h,al

        mov al,BANK_A_MASK
        out 31h,al

        ;program vector registers

        mov al,ICW2             ;IRQ0
        out 38h,al
        mov al,ICW2+1           ;IRQ1
        out 39h,al
        mov al,ICW2_VECTOR      ;IRQ1.5 (probably never used in
        out 3Ah,al              ; this system)
        mov al,ICW2+3           ;IRQ3
        out 3Bh,al
        mov al,ICW2+4           ;IRQ4
        out 3Ch,al
        mov al,ICW2+7           ;IRQ7
        out 3Fh,al

        ;remove ICW2 interrupt request

        in  al,31h              ;read mask register to work around
                                ; A-step errata
        in  al,32h              ;read ICW2 register to clear
                                ; interrupt request

        sti                     ;re-enable interrupts
        ret




In applications where 8259 compatibility is required, the ICW2 interrupt handler must be
invoked whenever an interrupt controller is initialized (ICW1-ICW2-ICWn sequence). The
handler's purpose is to read the ICW2 value from the ICW2 read-register and write the
appropriate sequence of vectors to the vector registers. Listing 2 shows the typical
initialization sequence (this is not changed from the 8259), and the required initialization
for operation of the ICW2 interrupt handler. Listing 3 shows the ICW2 interrupt handler.


Listing 2. Initialization of Bank A for ICW2 Interrupts


        mov al,ICW1
        out 30h,al
        mov al,ICW2
        out 31h,al

        mov al,ICW3             ;send ICW3 if necessary
        out 31h,al              ; note that using ICW3 for
                                ; cascading bank B is not required
                                ; and will affect the way EOIs are
                                ; required for nesting. It is
                                ; advised that ICW3 not be used.

        mov al,ICW4
        out 31h,al

        mov al,Bank_A_Mask      ;write to mask register (OCW1=7Bh)
        out 31h,al              ;don't mask off IRQ1.5 or Default
                                ; interrupt (IRQ7)

        mov al,ICW2_VECTOR      ;IRQ1.5
        out 3Ah,al

        mov al,IRQ7_DEFAULT_VECTOR
        out 3Fh,al

        in  al,31h              ;read mask register to work around
                                ; A-step errata

        in  al,32h              ;read ICW2 register to clear
                                ; interrupt request

        ;at this point install interrupt call vector for ICW2, if
        ;not already done somewhere else in the code
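
The mask value quoted in Listing 2 (OCW1=7Bh) can be checked by decoding it bit by bit. The sketch below is a host-side Python illustration under two assumptions that the listing implies but does not state outright: a mask bit of 1 disables an input, as on the 8259, and each bit position matches the order of the bank A vector registers (38h to 3Fh), with positions 5 and 6 unused. The names `BANK_A_INPUTS` and `unmasked_inputs` are hypothetical.

```python
# Assumed bit position of each bank A input in the mask register.
BANK_A_INPUTS = {0: "IRQ0", 1: "IRQ1", 2: "IRQ1.5", 3: "IRQ3", 4: "IRQ4", 7: "IRQ7"}


def unmasked_inputs(mask):
    """Return the bank A inputs whose mask bit is 0 (assumed: 1 = masked)."""
    return [name for bit, name in sorted(BANK_A_INPUTS.items())
            if not (mask >> bit) & 1]


print(unmasked_inputs(0x7B))   # ['IRQ1.5', 'IRQ7']
```

Under these assumptions 7Bh leaves only IRQ1.5 (the ICW2 interrupt) and IRQ7 (the default interrupt) enabled, which agrees with the comment in the listing.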


Listing 3. ICW2 Interrupt Service Routine

        push ax
        push cx
        push dx

        in   al,22h             ;read ICW2
        mov  cx,8               ;count vectors
        mov  dx,28h             ;point to vectors

BANK_B_LOOP:
        out  dx,al              ;write vector
        inc  al                 ;next vector
        inc  dx                 ;next vector I/O address
        loop BANK_B_LOOP

        in   al,0A2h            ;read ICW2
        mov  cx,8               ;count vectors
        mov  dx,0A8h            ;point to vectors

BANK_C_LOOP:
        out  dx,al              ;write vector
        inc  al                 ;next vector
        inc  dx                 ;next vector I/O address
        loop BANK_C_LOOP

        pop  dx
        pop  cx
        pop  ax
        iret
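
The effect of each loop in the service routine can be traced on a host machine. The Python sketch below is an illustration only: `icw2_handler_writes` is a hypothetical helper, and the ICW2 values passed to it (70h and FCh) are example inputs, not values taken from the listing. It mirrors the `out dx,al` / `inc al` / `inc dx` / `loop` sequence, including the 8-bit wrap of AL.

```python
def icw2_handler_writes(icw2, first_port, count=8):
    """Return the (port, value) pairs one loop of the routine would output."""
    writes = []
    al, dx = icw2 & 0xFF, first_port
    for _ in range(count):          # loop executes CX = 8 times
        writes.append((dx, al))     # out dx,al
        al = (al + 1) & 0xFF        # inc al (wraps at 8 bits)
        dx += 1                     # inc dx
    return writes


bank_b = icw2_handler_writes(0x70, 0x28)
print([(hex(p), hex(v)) for p, v in bank_b])
```

Each bank therefore receives eight consecutive vectors starting at the ICW2 value, written to eight consecutive I/O addresses starting at the bank's first vector register.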




Bank A:
  30H   write        ICW1, OCW2, OCW3
        read         Poll, IRR, ISR
  31H   write        ICW2, ICW3, ICW4, OCW1
        read         IMR
  32H   read         ICW2 read register
  38H   read/write   IRQ0 vector
  39H   read/write   IRQ1 vector
  3AH   read/write   IRQ1.5 vector
  3BH   read/write   IRQ3 vector
  3CH   read/write   IRQ4 vector
  3DH   RESERVED
  3EH   RESERVED
  3FH   read/write   IRQ7 vector

Bank B:
  20H   write        ICW1, OCW2, OCW3
        read         Poll, IRR, ISR
  21H   write        ICW2, ICW3, ICW4, OCW1
        read         IMR
  22H   read         ICW2 read register
  28H   read/write   IRQ8 vector
  29H   read/write   IRQ9 vector
  2AH   RESERVED
  2BH   read/write   IRQ11 vector
  2CH   read/write   IRQ12 vector
  2DH   read/write   IRQ13 vector
  2EH   read/write   IRQ14 vector
  2FH   read/write   IRQ15 vector

Bank C:
  A0H   write        ICW1, OCW2, OCW3
        read         Poll, IRR, ISR
  A1H   write        ICW2, ICW3, ICW4, OCW1
        read         IMR
  A2H   read         ICW2 read register
  A8H   read/write   IRQ16 vector
  A9H   read/write   IRQ17 vector
  AAH   read/write   IRQ18 vector
  ABH   read/write   IRQ19 vector
  ACH   read/write   IRQ20 vector
  ADH   read/write   IRQ21 vector
  AEH   read/write   IRQ22 vector
  AFH   read/write   IRQ23 vector
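
The three banks share one layout relative to a base address, which the following Python sketch makes explicit. It is an illustration derived only from the addresses listed above; `BANK_BASE` and `bank_ports` are hypothetical names, and the returned vector list includes the addresses marked RESERVED (3DH, 3EH and 2AH), which a caller would need to skip.

```python
BANK_BASE = {"A": 0x30, "B": 0x20, "C": 0xA0}   # first port of each bank


def bank_ports(bank):
    """Return the I/O addresses of one bank, derived from its base address."""
    base = BANK_BASE[bank]
    return {
        "command": base,        # write ICW1/OCW2/OCW3, read Poll/IRR/ISR
        "data": base + 1,       # write ICW2/ICW3/ICW4/OCW1, read IMR
        "icw2_read": base + 2,  # ICW2 read register
        "vectors": [base + 8 + n for n in range(8)],
    }


print(hex(bank_ports("C")["icw2_read"]))   # 0xa2
```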




APPENDIX D
SYSTEM NOTES


1. BHE# IN MASTER MODE


In Master Mode, BHE# will be activated during DMA to/from 8-bit devices residing at even
locations when the remaining byte count is greater than 1.

For example, if an 8-bit device is located at 00000000 Hex and the number of bytes to be
transferred is > 1, the first address/BHE# combination will be 00000000/0. In some systems
this will cause the bus controller to perform two 8-bit accesses, the first to 00000000 Hex
and the second to 00000001 Hex. However, the 82370's DMA will only read/write one byte. This
may or may not cause a problem in the system depending on what is located at 00000001 Hex.


Solution:

There are two solutions if BHE# active is unacceptable. Of the two, number 2 is the cleanest
and most recommended.

1. If there is an 8-bit device that uses DMA located at an even address, do not use that
address + 1. The limitation of this solution is that the user must have complete control over
what addresses will be used in the end system.

2. Do not allow the Bus Controller to split cycles for the DMA.
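
The condition described in this note can be expressed as a small host-side check. The Python sketch below is illustrative only: `bhe_active` and `split_accesses` are hypothetical names, and `split_accesses` models just the kind of bus controller that the text says splits the cycle into two 8-bit accesses.

```python
def bhe_active(address, bytes_remaining):
    """True when BHE# is driven active (low) for an 8-bit device transfer."""
    return address % 2 == 0 and bytes_remaining > 1


def split_accesses(address, bytes_remaining):
    """Byte addresses touched by a bus controller that splits cycles."""
    if bhe_active(address, bytes_remaining):
        return [address, address + 1]   # second access hits address + 1
    return [address]


print(split_accesses(0x00000000, 2))   # [0, 1]
print(split_accesses(0x00000000, 1))   # [0]
```

Solution 1 amounts to making sure nothing sensitive lives at the second address in the list; solution 2 removes the second access altogether.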


2. RESET OUTPUT OF 82370

The 80376 requires its RESET line to be active for 80 clock cycles. The 82370 holds the RESET
line active for 62 clock cycles.

The following design example shows how the user can extend the active-high period of the
RESET line to 80 clock cycles.


Extending the RESET Output of the 82370 


This section describes a hardware solution for using the 82370's CPURST output and the
software reset command to cause the 80376 to enter into a self-test.


The 80376 requires two simultaneous events in order to initiate the self-test sequence. The
RESET input of the processor must be held active for at least 80 CLK2 periods and the BUSY#
input must be low 8 CLK2 periods prior to and 8 CLK2 periods subsequent to RESET going
inactive.
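
The two conditions can be written as a simple predicate. The Python sketch below is a host-side illustration, not a timing model: `self_test_requested` and its parameters are hypothetical, and all three arguments are counts of CLK2 periods.

```python
MIN_RESET_CLK2 = 80   # RESET must be active at least this many CLK2 periods
MIN_BUSY_CLK2 = 8     # BUSY# must be low this long before and after RESET falls


def self_test_requested(reset_width, busy_low_before, busy_low_after):
    """True when both self-test conditions stated in the text are met."""
    return (reset_width >= MIN_RESET_CLK2
            and busy_low_before >= MIN_BUSY_CLK2
            and busy_low_after >= MIN_BUSY_CLK2)


print(self_test_requested(62, 8, 8))   # False: a 62-period pulse is too short
print(self_test_requested(80, 8, 8))   # True
```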


A system which does not have an 80387SX will simply have the BUSY# input to the 80376 tied
low. A system which contains the 80387SX will require extra logic between the BUSY# output of
the 80387SX and the BUSY# input of the 80376 in order to force self-test on reset. The extra
BUSY# logic required will not be described here.


The 82370 CPURST output is intended to be retimed with faster TTL components in order to meet
the RESET input setup time requirements of the 80376 and 80387SX. This requires a 74F379
(quad flip-flop with enable) or equivalent. The flip-flops required are described in TECHBIT
(Ed Grochowski, April 10, 1987).


The 82370 does not meet the RESET pulse duration requirements for causing self-test of the
80376 when a software reset command is issued to the 82370. The 82370 provides a RESET pulse
width of 62 CLK2 periods; the 80376 requires 80 CLK2 periods, as mentioned earlier.


In order to cause the 80376 to do a self-test after a software reset, the CPURST output pulse
of the 82370 must be lengthened. Figure 1 shows a circuit which will do this.




Note that the CPURST output is the OR of the 82370 RESET input and the output of the software
reset command logic, and thus will have the same duration as the RESET input during power-on.


Compared with a system without the 82370, the additional circuitry required consists of an OR
gate, a one-shot, a capacitor, and a resistor. The one-shot (74121) is inserted between the
CPURST output of the 82370 and the input of the retiming flip-flops (74F379). The period of
the one-shot should be long enough to guarantee the 80 CLK2 periods that the 80376 requires.
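
As a rough aid to sizing the one-shot, the required pulse width and a matching timing capacitor can be estimated as follows. This Python sketch rests on assumptions that are not in the text above: a CLK2 frequency of 32 MHz, a 10 kilohm timing resistor, and the usual 74121 data-sheet approximation that the output pulse width is about ln(2) x Rext x Cext. The function names are hypothetical, and component tolerances are ignored, so a real design would add margin.

```python
import math


def min_pulse_seconds(clk2_hz, periods=80):
    """Shortest one-shot pulse that spans the given number of CLK2 periods."""
    return periods / clk2_hz


def min_capacitance(clk2_hz, r_ohms, periods=80):
    """Smallest Cext for which ln(2) * R * C reaches the required width."""
    return min_pulse_seconds(clk2_hz, periods) / (math.log(2) * r_ohms)


tw = min_pulse_seconds(32e6)           # assumed 32 MHz CLK2
c = min_capacitance(32e6, 10e3)        # assumed 10 kilohm resistor
print(f"minimum pulse width: {tw * 1e6:.2f} us")
print(f"minimum capacitance: {c * 1e12:.0f} pF")
```

With these assumed values the pulse must last at least 2.5 microseconds, which calls for a capacitor of a few hundred picofarads before margin is added.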


The OR gate (74F32) is required to guarantee that the 80376 is held in a RESET state while
the 82370 is being reset. This is done to be sure that BE3# is held low when the RESET input
to the 82370 goes inactive. BE3# is used during the reset to determine whether it is necessary
to enter a special factory test mode. It must be low when the RESET input goes inactive, and
the 80376 drives it low during reset.


COMPREHENSIVE ARCHITECTURE DEVELOPMENT SUPPORT FOR 80960 EMBEDDED APPLICATIONS


Intel's 80960 Development Starter Kits provide a quick, easy and economic way to evaluate
Intel's 80960 architecture, benchmark 80960 performance, and begin initial application code
development and debug. Tools were designed specifically for the 80960 microprocessor, allowing
developers to take full advantage of the performance and ease-of-programming features built
into the 80960's RISC-based design. The 80960 Development Starter Kits are conveniently hosted
on the IBM PC AT, meaning developers of 32-bit embedded microprocessor applications can get
started with minimal hardware investment.


FEATURES

• ASM-960 macro assembler for developing and tuning speed-critical code.

• iC-960 highly optimizing C language compiler for high-level language software development.

• EVA-960KB plug-in software execution board for benchmarking performance, evaluating
  architecture, and developing and debugging application code.

• Many starter kit configurations for supporting a wide range of development needs.

• DOS-hosted on IBM PC/AT and compatibles.

• VAX/VMS*-hosted ASM-960 and iC-960 on VAX/VMS and MicroVAX/VMS* in Q4 1988.

• Sun 3**-hosted ASM-960 and iC-960 in Q4 1988.
* VAX/VMS and MicroVAX/VMS are trademarks of Digital Equipment Corporation.

** Sun 3 is a trademark of Sun Microsystems.

Intel Corporation assumes no responsibility for the use of any circuitry other than circuitry
embodied in an Intel product. No other circuit patent licenses are implied. Information
contained herein supersedes previously published specifications on these devices from Intel
and is subject to change without notice.

~.198A 
e In&cI 
CarpcwaIun 
1_ 
(Jrdcr•••••tu: 2801W1oW'l 


ASM-960 
MACRO ASSEMBLER 


The ASM-960 macro assembler is used to fine-tune sections of code for top program execution
speed on the 80960KA, 80960KB, and 80960MC. ASM-960 does this by giving programmers absolute
control over program instructions. In addition to the assembler and macro preprocessor,
ASM-960 includes several utilities for application program maintenance and debug:

• LINKER/LOADER allows multiple and incremental program file links.

• ARCHIVER allows developers to build application function libraries.

• DISASSEMBLER provides assembler mnemonics.

• SYMBOL DUMPER provides symbolic information from a program file for facilitating debug.

• PROM BUILDER produces a hex file suitable for PROM programmers.


iC-960 C LANGUAGE COMPILER

iC-960 is a highly optimizing C language compiler for the 80960KB and 80960MC
microprocessors. iC-960 supports the full C language as described in the Kernighan and
Ritchie book, The C Programming Language (Prentice-Hall, 1978). iC-960 is used in conjunction
with ASM-960 for outputting object code files and includes standard ANSI extensions to the C
language and the following enhancements for embedded application development:

• Constants allow high-level language definitions and ease of program maintenance.

• Memory-mapped I/O allows high-level language access to application specific input and
  output.

• Inline assembly simplifies the integration of convenient C language and speed critical
  functions.

• Floating point support produces in-line code to take full advantage of the floating point
  capability of the 80960KB and 80960MC.

The DOS-hosted version requires a 2MB Above Board. The execution vehicle hosted version
requires the 4MB board (EVA960KB4MB) and provides a 5X compile-time speed improvement over
the DOS-hosted version.


EVA-960KB SOFTWARE EXECUTION VEHICLE

The EVA-960KB is a software execution vehicle for the 80960KA/KB microprocessor. It is a
single PC AT plug-in board which provides easy and convenient architecture evaluation and
benchmarking as well as software development. The EVA-960KB contains the following:

• 1M byte or 4M byte of one wait-state program memory (DRAM)
• 64K bytes of zero wait-state program memory (SRAM)
• Three application program accessible timers
• Hosted debug monitor which supports: 2 program breakpoints, single step program execution,
  register and memory access, program download and upload
• DOS access libraries that allow: screen display, keyboard input, read and write disk files,
  ability to spawn a DOS process which could communicate to serial or parallel I/O
• 20 MHz operation


ARCHITECTURE EVALUATION: STARTER KIT 1

The 80960 Development Starter Kit #1 is designed for immediate architecture evaluation and
code development for 80960KA and 80960KB. It includes the EVA-960KB execution vehicle and
ASM-960 Assembler for developing and debugging speed-critical software and performing basic
performance benchmarking.


COMPLETE APPLICATION DEVELOPMENT: STARTER KIT 2

The 80960 Development Starter Kit #2 is a complete application development toolkit for
developing and debugging both speed-critical and high-level software, as well as performing
full benchmarking on the 80960KB. The kit includes the EVA-960KB Software Execution Vehicle,
the ASM-960 Assembler, and the iC-960 C Language Compiler.


SOFTWARE DEVELOPMENT: STARTER KIT 3

The 80960 Development Starter Kit #3 provides all the software needed to get started on all
phases, both speed critical and high-level, of software development. The kit includes the
ASM-960 Assembler and the iC-960 C Language Compiler.


SOFTWARE DEVELOPMENT: STARTER KIT 4

The Software Development Starter Kit #4 includes ASM960M Assembler and C960M C Compiler
hosted on a MicroVAX/VMS. It provides all of the software needed to get started developing an
80960 application.


SOFTWARE DEVELOPMENT: STARTER KIT 5

The Software Development Starter Kit #5 includes ASM960V Assembler and C960V C Compiler
hosted on VAX/VMS. It provides all of the software needed to get started developing an 80960
application.


FAST APPLICATION DEVELOPMENT: STARTER KIT 6

The Fast Development Starter Kit, Starter Kit #6, provides a development time speed
improvement for customers that already own ASM960D. The kit includes the 4M byte execution
vehicle, EVA960KB4MB, and the execution vehicle hosted C Compiler, C960EP.


COMPLETE FAST APPLICATION DEVELOPMENT: STARTER KIT 7

The Complete Fast Development Starter Kit, Starter Kit #7, provides a complete and fast
development toolkit. The kit includes the 4M byte execution vehicle, EVA960KB4MB, ASM960D
Assembler and the execution vehicle hosted C Compiler, C960EP.


SOFTWARE DEVELOPMENT: STARTER KIT 8

The Software Development Starter Kit #8 includes ASM960U Assembler and C960U C Compiler
hosted on a Sun 3 workstation. It provides all of the software needed to get started
developing an 80960 application.


SERVICE, SUPPORT AND TRAINING

Intel augments its 80960 development tools with a full array of seminars, classes and
workshops; field application engineering expertise; and telephone and on-site support at all
stages of development.


ORDERING INFORMATION

Products:

ASM-960D      ASM-960 Assembler. Contains the assembler, linker/loader, macro preprocessor,
              archiver, PROM builder, and other object module utilities. DOS hosted.
              Requires a class I software license agreement plus addendum.
ASM960M       Same as above, MicroVAX/VMS hosted. Requires a class I software license
              agreement plus addendum.
ASM960V       Same as above, VAX/VMS hosted. Requires a class I software license agreement
              plus addendum.
ASM960U       Same as above, Sun 3 hosted. Requires a class I software license agreement
              plus addendum.
C960DP        iC-960 optimizing C compiler, with ANSI extensions for embedded applications.
              Contains standard STDIO libraries and in-line assembly capability. DOS hosted.
              Requires a 2M byte Above Board.
C960EP        Same as above, execution vehicle (EVA960KB4MB) hosted.
C960M         Same as above, MicroVAX/VMS hosted.
C960V         Same as above, VAX/VMS hosted.
C960U         Same as above, Sun 3 hosted.
EVA960KB      Software development and execution vehicle for the 80960KA and 80960KB
              microprocessors. Contains 1M byte DRAM program memory.
EVA960KB4MB   Software development and execution vehicle for the 80960KA and 80960KB
              microprocessors. Contains 4M byte DRAM program memory.

Starter kits:

960SKIT1      Architecture Evaluation Kit. Includes EVA960KB execution vehicle plus ASM-960D
              Assembler.
960SKIT2      Same as above plus iC960DP Compiler (requires Above Board).
960SKIT2AB    Same as SKIT2 plus Intel Above Board with 2M byte memory.
960SKIT3      Contains ASM-960D and iC960DP Compiler.
960SKIT3AB    Same as above plus Intel Above Board with 2M byte memory.
960SKIT4      Contains ASM960M Assembler and C960M C Compiler, hosted on MicroVAX/VMS.
960SKIT5      Contains ASM960V Assembler and C960V C Compiler, hosted on VAX/VMS.
960SKIT6      Fast development kit. Contains EVA960KB4MB execution vehicle and C960EP
              execution vehicle hosted C Compiler.
960SKIT7      Same as above plus ASM960D Assembler.
960SKIT8      Contains ASM960U Assembler and C960U C Compiler, hosted on Sun 3.


ADA-960 DEVELOPMENT ENVIRONMENT FOR THE 80960


A COMPLETE ADA SOLUTION FOR REAL-TIME EMBEDDED APPLICATIONS


Ada-960 from Intel is a complete Ada development environment for 80960MC based real-time
embedded applications.

The 80960MC is a high performance, 32-bit military embedded processor especially designed to
support Ada in fault-tolerant, shared-memory multiprocessor applications.

Ada-960 is hosted on VAX/VMS.* The cross-development environment includes a highly optimizing
Ada cross-compiler, a linker, a librarian, a source-level symbolic debugger, a target monitor,
predefined packages and subprograms, the Ada run-time system, a user guide, and a detailed
run-time system implementor's guide. The run-time system makes optimal use of the Ada support
built into the 80960MC processor and is carefully designed for real-time embedded
applications.


FEATURES

• Complete VAX/VMS* hosted Ada cross development environment

• Makes optimal use of the Ada support offered by the 80960MC

• Run-time system is small, fast and predictable for real-time applications

• Designed for embedded applications with a highly optimizing compiler, selective linking,
  highly modular reconfigurable run-time, and source level symbolic debugger interfacing to
  a target monitor or an emulator


THE 80960MC AND ADA: A GOOD MATCH

The Intel 80960MC embedded processor is designed to support applications written in Ada.
Ada-960 from Intel is implemented to make optimal use of the Ada support built into the
80960MC.

• Ada-960 maps Ada tasks directly to 80960MC processes.
• Ada-960 uses the 80960MC embedded processor hardware to dispatch and manage Ada tasks.
• Ada-960 maps Ada task priorities directly to 80960MC process priorities.
• Ada-960 uses the 80960MC memory management unit to provide inter-task protection.
• Ada-960 uses 80960MC semaphores to implement run-time system critical sections.
• Ada-960 uses the 80960MC on-chip floating point unit to perform floating point operations.

The unique architecture of the 80960MC allows Ada-960 to use the processor hardware to provide
functionality normally implemented in software on other architectures. This includes automatic
dispatching and pre-emptive priority scheduling of Ada tasks.

Ada-960's use of the 80960MC makes the run-time easily extensible to support fault-tolerant
shared-memory multiprocessor configurations supported by the 80960MC embedded processor.


ADA-960 FOR REAL-TIME APPLICATIONS

Ada-960 is carefully designed for use in the development of real-time applications. Some of
the real-time features of Ada-960 include:

Minimal "Interrupts Off" Time: The Ada-960 run-time system disables interrupts for a minimal
amount of time. The "Interrupts off" time does not vary with the size of the application.

Pre-emptive Priority Scheduling: Ada-960 provides a fully pre-emptive, priority-driven tasking
run-time. The 80960MC hardware is used to ensure that the highest priority task is always the
one that is running. The run-time system uses the 80960MC hardware to switch to a higher
priority task (than the one currently executing) whenever such a task becomes ready to run.

Predictable Performance: Ada-960 provides predictable performance that is insensitive to
target system load. System response time remains constant and fully deterministic as the
number of tasks, etc., grows.
• Ada-960 ensures that scheduling latency is independent of system load.
• Ada-960 guarantees response-time to interrupts.
• Ada-960 provides predictable memory allocation times. Memory allocation is implemented
  through efficient algorithms that ensure a constant upper bound on the time taken to
  allocate memory.

Run-time Extension: Ada-960 provides a run-time system extension package that gives
applications dynamic control over tasking, scheduling, critical sections and other run-time
functions.


ADA-960 FOR EMBEDDED APPLICATIONS

Ada-960 is designed for embedded applications and provides:

Small, Fast Run-time System: Ada-960 provides a compact, high performance run-time system.
The run-time system is very modular to support selective linking by the Ada-960 linker. The
modularity of the run-time and the selective linking features of the linker ensure that all
unused Ada language features are automatically omitted from the application's final
executable image.

Retargetable Run-time System: Ada-960 provides an easily retargetable run-time system. The
run-time system is designed to be easily retargeted to custom 80960MC boards and comes with
all the necessary source files and documentation for on-site customization to specific
interrupt and I/O requirements.

VAX/VMS-hosted Source Debugger: Ada-960 provides a VAX/VMS-hosted source-level, symbolic Ada
debugger. The debugger allows users to debug applications on a remote 80960MC target via its
interface to either the standard Intel 80960MC monitor or the standard Intel 80960MC
in-circuit-emulator.

Retargetable Target Monitor: Ada-960 provides an easily retargetable target monitor. The
target monitor resides on the target board. The monitor communicates with and supports the
Ada debugger hosted on VAX/VMS. The target monitor is easily retargetable to custom boards.
This allows the Ada debugger to be used in debugging applications on non-standard 80960MC
hardware configurations.

ROMable Code and 86Hex: The Ada-960 compiler produces ROMable code. The Ada linker can
produce an application's executable image in 86Hex so that the application may be easily
burnt into PROMs.

Combining Ada and non-Ada Code: Ada-960 provides implementation-defined pragma FOREIGN_BODY
and pragma LINKAGE_NAME to support the combination of Ada and non-Ada code.

Chapter 13 Support: Ada-960 provides chapter 13 support including representation
specifications, machine-code insertion, interrupt entries, and so on.


CROSS-DEVELOPMENT ENVIRONMENT

The Ada-960 cross development environment includes tools for compiling, linking, and
debugging, along with libraries and a complete set of documentation. The Ada-960 cross
development environment from Intel makes the following support, documentation, tools, and
software available:


Compiler: A fast, highly optimizing Ada-960 compiler that generates efficient, compact
80960MC code.

The compiler performs virtually all optimizations that are "traditional", as well as several
that are Ada-specific. Each optimization is carefully tuned to the 80960MC architecture.

Optimizations performed include transformations that affect value and variable handling, code
motion and elimination, tail recursion elimination, and loop strength reduction. Ada-specific
data packing and code transformations such as constraint check elimination, overflow check
elimination, and parameter binding are also performed. The Ada-960 code generator schedules
generated machine instructions to make optimal use of parallel execution opportunities
available on the 80960MC embedded processor.

Librarian: An Ada-960 librarian to manage the Ada program library. The librarian controls the
interaction of compilation units and the linking of executable images. The Ada-960 librarian
supports the Ada separate compilation and dependency control requirement.


Linker: An advanced Ada-960 linker to support enhanced selective linking at the subprogram
level. Subprograms that are not used in an Ada application are not linked unless specifically
requested. Full control over memory layout and mapping is supported with a rich command
language.

Debugger: A very powerful Ada-960 source-level, symbolic Ada debugger that supports the
debugging of both pure Ada and combined Ada code. The Ada-960 debugger is hosted on VAX/VMS
and interfaces with target 80960MC boards through either the standard Intel 80960MC target
monitor or the standard 80960MC in-circuit-emulator.


The Ada-960 debugger allows users to examine and modify their applications using the same
names that appear in the source program. Users can evaluate Ada expressions, set breakpoints
and tracepoints, and debug multi-tasking Ada programs.

Program breakpoints can be made conditional on arbitrary conditions, and debugger commands
can be executed automatically at breakpoints.

The Ada-960 debugger can call functions and procedures in an Ada application. This feature
can be used to extend the set of debugger facilities or to test parts of the application
interactively.

The Ada-960 debugger allows users to display Ada variables in formats appropriate to each of
their types. Users can also specify formats appropriate to the current application. Special
browsing features within the debugger eliminate the need for paper listings during debugging
sessions.

The Ada-960 debugger provides a flexible display of the state of Ada tasks and can display
the current callstack within any task. The debugger can list tasks on various run-time queues
and can suspend or change the priority of tasks.

The source level features of the Ada-960 debugger are complemented by a complete set of
machine-level commands.

Script files containing debugger commands may be created in and executed by the debugger. The
Ada-960 debugger can record a log of all debugging actions for later analysis or replay.

For better programmer productivity, the debugger has a multiwindow interface with separate
windows for debugger commands, Ada source and program output. The Ada-960 debugger provides
"scoreboard windows" for real-time display of user-selected program information.

The debugger is usable from all types of terminals and has special features to support
bit-mapped displays.

Use of the debugger is possible at all optimization levels without recompilation of the Ada
program.


Target Monitor: A standard Intel 80960MC target monitor that is easily retargetable to custom
80960MC boards. The target monitor comes with all the necessary source files and
documentation for on-site customization to specific interrupt and I/O requirements.

ICE™: A standard Intel 80960MC in-circuit-emulator. The in-circuit-emulator delivers
real-time emulation at processor speed, and allows non-intrusive debugging of applications
under development.

Predefined Packages and Subprograms: An Ada-960 library containing precompiled predefined Ada
packages and subprograms.

Run-time System: An Ada-960 library containing the Ada run-time system. The run-time system
is small, fast, and predictable. The Ada-960 run-time is re-targetable to different 80960MC
boards and is especially designed for real-time embedded applications. The run-time system is
carefully designed to make optimal use of the Ada support provided by the 80960MC embedded
processor.

The run-time system comes with all the necessary source files and documentation for on-site
customization to specific interrupt and I/O conventions. Most of the run-time system is
written in Intel Ada-960 with a small portion written in Intel ASM-960 assembler.

Run-time System Extension: A run-time system extension package to allow Ada applications
dynamic control over the tasking, scheduling, critical sections, and other run-time
functions.

Source Code: Source code for both the run-time system and the run-time system extension.


Documentation: High quality documentation including a User Guide and a detailed Run-time
System Implementor's Guide. The Run-time System Implementor's Guide provides full
documentation of the interface between the Ada-960 compiler and the run-time system. It also
documents the design of, and provides guidance to, modifying the run-time system.


WORLDWIDE SERVICE AND SUPPORT

Intel augments its 80960 architecture family development tools with a full array of seminars,
classes, and workshops; on-site consulting services; field application engineering expertise;
telephone hot-line support; and software and hardware maintenance contracts. This full line
of services will ensure your design success.

IN-CIRCUIT EMULATOR FOR THE 80960KA AND 80960KB MICROPROCESSORS


The ICE™-960KB In-Circuit Emulator delivers real-time hardware and software debugging
capabilities for 80960KA/KB-based designs. The capabilities include emulation of the
80960KA/KB microprocessor, hardware and software breakpoint specification, fastbreaks, two
types of trace capability, large trace buffering and a sophisticated human interface. The
ICE-960KB In-Circuit Emulator gives you unmatched control over all phases of
hardware/software debug, including developing, integrating and testing, which improves the
developer's productivity and speeds time to market.


FEATURES

• Real-Time Emulation of the 80960KA/KB microprocessors up to 20 MHz
• 256K bytes of memory in Standalone Self Test Unit
• Zero wait-state operation from user system memory
• Examine and modify memory and the 80960 Registers
• 2 hardware and 32 software Breakpoints settable on any Instruction Address, and break on
  Trace Buffer Full
• Hosted on IBM PC AT* running DOS (version 3.3)
• Assembly and Disassembly of code in 80960 instruction mnemonics
• Dynamically monitor or update program variables or memory with Fastbreaks
• Real-time Bus Trace with Time-Tags for tracking code execution times
• Execution Trace for tracking instruction execution inside on-chip Instruction Cache
• Stores 1024 frames of program execution history or bus cycles or both
• Versatile software featuring Color, Pulldown Menus, Forms, Command Line with Syntax
  Guidance and Editing, Control Constructs, Debug Procedures and DOS Command Input (shell)


Intel Corporation assumes no responsibility for the use of any circuitry other than circuitry
embodied in an Intel product. No other circuit patent licenses are implied. Information
contained herein supersedes previously published specifications on these devices from Intel
and is subject to change without notice.


REAL-TIME EMULATION


The ICE-960KB In-Circuit Emulator provides emulation
of the 80960KA/KB at speeds up to 20 MHz, thus
providing early detection of subtle timing problems that
may arise at full speed. Intel's intimate knowledge of the
component makes possible the tightest conceivable
conformance between timing parameters of the emulator
and the target microprocessor.


PROCESSOR/MEMORY 
EXAMINATION 
AND MODIFICATION 


The 80960KA/KB registers can be accessed
mnemonically (e.g. g12, r5, fp3) with the ICE-960KB
emulator software. Data can be displayed or modified in
one of four bases (hexadecimal, decimal, octal, or
binary). Program memory contents can be disassembled
and displayed as 80960 assembly instruction
mnemonics. Additionally, 80960 assembly instruction
mnemonics can be assembled and stored into program
memory.
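
As an illustration of the four display bases, the following minimal Python sketch renders one 32-bit value in each of them. It is not emulator software; the function name format_in_bases and the sample value are assumptions made only for this example.

```python
def format_in_bases(value: int, width_bits: int = 32) -> dict:
    """Render a register-sized value in hexadecimal, decimal, octal and binary.

    The value is masked to width_bits so negative inputs show up as the
    unsigned bit pattern a 32-bit register would hold.
    """
    mask = (1 << width_bits) - 1
    unsigned = value & mask
    return {
        "hexadecimal": format(unsigned, "x"),
        "decimal": format(unsigned, "d"),
        "octal": format(unsigned, "o"),
        "binary": format(unsigned, "b"),
    }


if __name__ == "__main__":
    # Hypothetical register contents, chosen only for illustration.
    for base, text in format_in_bases(0x1F).items():
        print(f"{base:>11}: {text}")
```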


PROGRAM 
TRACING 


The ICE-960KB emulator can store 1024 frames of
program execution history or 5120 cycles of the
80960KA/KB address/data bus activity in the trace
buffer. Each frame of program execution contains a
discontinuity address (branch, call, return, etc.) and a
time-tag. This information can be used to reconstruct a
history of the program execution. With the execution
trace option enabled, the ICE-960KB will run at less than
full speed, typically 70-90% of full speed. Each trace
frame of bus cycles contains one complete bus burst
access (the address cycle followed by the four data
cycles) and a time-tag. While using bus trace, the
ICE-960KB runs at the full speed of the 80960KA/KB
microprocessor.
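
The two trace capacities quoted above are consistent with each other: 1024 frames, each holding one address cycle and four data cycles, gives 5120 bus cycles. A short sketch of that arithmetic follows; the constant and function names are chosen here for illustration only.

```python
TRACE_FRAMES = 1024            # frames held by the trace buffer (from the text)
ADDRESS_CYCLES_PER_BURST = 1   # each bus frame starts with one address cycle
DATA_CYCLES_PER_BURST = 4      # followed by four data cycles


def bus_cycles_captured(frames: int = TRACE_FRAMES) -> int:
    """Total bus cycles stored when every frame holds one full burst access."""
    return frames * (ADDRESS_CYCLES_PER_BURST + DATA_CYCLES_PER_BURST)


if __name__ == "__main__":
    print(bus_cycles_captured())
```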
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EVENT RECOGNITION 
(BREAKPOINT CONTROL) AND EMULATION CONTROL


Two hardware and thirty-two software breakpoints can
be active at any time. The ICE-960KB emulator allows
any number of breakpoints to be defined and then
activated when needed. The breakpoints can be set on
any instruction address. Additionally, emulation can be
automatically stopped when the trace buffer is full.
Besides the ability to execute program code at full speed
between specified points, the ICE-960KB emulator
provides the capability to single-step through program
code. Fastbreaks are short pauses in program execution
to examine or modify memory or 80960 registers.
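
The distinction between defined and active breakpoints can be modelled as follows. This is a minimal sketch of the bookkeeping only, assuming a hypothetical BreakpointTable class; it does not reflect the emulator's actual command set or internals.

```python
from dataclasses import dataclass, field

# Limits on simultaneously active breakpoints, taken from the text.
MAX_ACTIVE = {"hardware": 2, "software": 32}


@dataclass
class BreakpointTable:
    """Any number of breakpoints may be defined; only a few may be active."""
    defined: dict = field(default_factory=dict)  # name -> (kind, address)
    active: set = field(default_factory=set)     # names currently active

    def define(self, name: str, kind: str, address: int) -> None:
        if kind not in MAX_ACTIVE:
            raise ValueError(f"unknown breakpoint kind: {kind}")
        self.defined[name] = (kind, address)

    def activate(self, name: str) -> None:
        kind, _ = self.defined[name]
        in_use = sum(1 for n in self.active if self.defined[n][0] == kind)
        if name not in self.active and in_use >= MAX_ACTIVE[kind]:
            raise RuntimeError(f"all {MAX_ACTIVE[kind]} {kind} breakpoints are in use")
        self.active.add(name)

    def deactivate(self, name: str) -> None:
        self.active.discard(name)
```

Deactivating one breakpoint frees a slot for another of the same kind, which mirrors the "define many, activate when needed" workflow described above.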


STANDALONE OPERATION 


Product software can be developed and debugged prior
to and independent of hardware availability with the
Standalone Self Test unit (SAST), which contains 256K
bytes of two wait-state program memory. The SAST also
provides diagnostic testing to assure full functionality
of the ICE-960KB emulator.


VERSATILE AND POWERFUL HOST SOFTWARE


The easy-to-use ICE-960KB emulator software takes
advantage of color and pull-down menus to complement
its already powerful command set. The software includes
an on-line help facility, a dynamic command entry and
syntax guide, a screen-oriented editor, an assembler and
disassembler, input/output redirection, command piping,
DOS command entry, and the ability to customize the
command set via debug procedures and literal
definitions.


DEBUG PROCEDURES AND LITERALS


Debug procedures (PROCs) are user-defined groups of
ICE-960KB emulator commands. They can be stored on
disk and recalled during later debugging sessions. PROCs
can be used to simplify the process of debugging by
grouping repetitive commands, or commands that must run
in a required order, which can then be accessed by typing
the name of the PROC. Literals are user-defined
abbreviations for whole or partial ICE-960KB emulator
commands. Literals are a shorthand method of
customizing the emulator commands to fit your needs
and preferences.
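
The idea behind literals, a user-defined abbreviation that stands for a longer piece of command text, can be sketched as simple word substitution. The command strings below are hypothetical placeholders, not real ICE-960KB syntax, and the function expand_literals is an assumption made for this example.

```python
def expand_literals(command: str, literals: dict) -> str:
    """Replace each whole word that names a literal with its definition."""
    return " ".join(literals.get(word, word) for word in command.split())


if __name__ == "__main__":
    # Hypothetical abbreviation table; real emulator commands will differ.
    my_literals = {"dr": "display registers"}
    print(expand_literals("dr g12", my_literals))
```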


HOST REQUIREMENTS


IBM PC-AT* (minimum requirements) with 640KB of conventional memory
1MB of RAM (Lotus, Intel, Microsoft expanded memory specification)
(Intel's Above™ Board with 1.0MB RAM is required)
20 MB Fixed Disk
At least one 5¼" Floppy Disk drive
A serial interface
80287 Numerics Coprocessor
DOS Operating System (version 3.3)


REQUIRED SYSTEM RESOURCES


The ICE-960KB emulator requires the following: a)
exclusive use of the 80960KA/KB's on-chip debug
registers and b) 304 bytes of target system RAM in the
register save area of the stack: 256 bytes for flushing the
80960 local registers and 48 bytes for saving the
processor control block (PRCB).
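
The 304-byte figure is the sum of the two areas listed. The sketch below checks that total. The further breakdown of the 256 bytes into four sets of sixteen 32-bit local registers is an assumption about the register cache, not something stated in this section.

```python
LOCAL_REGISTER_FLUSH_BYTES = 256  # from the text
PRCB_SAVE_BYTES = 48              # from the text

# Assumption (not stated in this section): the flush area holds four cached
# sets of sixteen 4-byte local registers.
ASSUMED_REGISTER_SETS = 4
REGISTERS_PER_SET = 16
BYTES_PER_REGISTER = 4


def required_target_ram() -> int:
    """Bytes of target RAM the emulator needs in the register save area."""
    return LOCAL_REGISTER_FLUSH_BYTES + PRCB_SAVE_BYTES


if __name__ == "__main__":
    print(required_target_ram())
    print(ASSUMED_REGISTER_SETS * REGISTERS_PER_SET * BYTES_PER_REGISTER)
```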


SPECIFICATIONS

                     Width            Height           Length             Weight
Unit                 Inches   cm      Inches   cm      Inches    cm       lbs    kg

Control unit         10.5     26.7    1.5      3.8     16.0      40.6     6.0    2.72
Processor module*    3.8      9.6     1.5      3.8     5.0       12.7
SAST                 6.0      15.2    2.0      5.1     8.0       20.3     3.5    1.59
OIB                  3.8      9.6     0.9      2.3     5.1       13.0
Power supply         2.8      7.1     4.2      10.7    11.0      27.9     4.7    2.14
User cable                                             22.0      55.9
Serial cable                                           12.0 ft   3.66 m


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ELECTRICAL 
SPECIFICATIONS 


SYNC Line Specification
The SYNCIN line must be valid for at least one
instruction cycle because it is only sampled on
instruction boundaries. The SYNCIN line is a standard
TTL input. The SYNCOUT line is driven by a TTL open
collector with a 4.75K-ohm pull-up resistor.


AC/DC Specifications
The following tables describe the AC and DC specification
differences between the ICE-960KB emulator and the
80960KA/KB microprocessor. For more details, refer to
the User Guide.




Figure 2: Optional Isolation 


TABLE 2. AC Specifications With The OIB Installed

Symbol  Parameter                                   Minimum        Maximum

t2      clock low time                              t2 + 1 ns
t3      clock high time                             t3 + 1 ns
t6      output valid delay
          A/D 0:31                                  t6 + 8 ns      t6 + 16 ns
          DT/R#, DEN#, BE0-3#, ADS#, W/R#           t6 + 7 ns      t6 + 14 ns
          HLDA, CACHE, LOCK#, INTA#                 t6 + 6 ns      t6 + 8 ns
          ALE#                                      t6 + 10 ns     t6 + 20 ns
t7      ALE# width                                  t7 - 6.5 ns
t8      ALE# disable delay                          t8 + 8 ns      t8 + 14 ns
t9      output float delay
          A/D 0:31                                  t9 + 5 ns      t9 + 22 ns
          DT/R#, DEN#, BE0-3#, ADS#, W/R#           t9 + 7 ns      t9 + 15 ns
          HLDA, CACHE, LOCK#, INTA#                 t9 + 6 ns      t9 + 8 ns
t10     input setup
          A/D 0:31                                  t10 + 2 ns
          BADAC#, INT0-3, deassertion               t10 + 14 ns
t11     input hold
          A/D 0:31, HOLD                            t11 + 6 ns
          BADAC#, INT0-3#, READY#                   t11 + 7 ns
t16     reset setup time                            t16 + 6
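
Each entry in Table 2 is an offset added to the corresponding limit of the 80960KA/KB itself. A minimal sketch of applying two of the t6 offsets follows; the processor limits passed in are hypothetical placeholders, since the processor's own values are not given here, and the names T6_OIB_OFFSET_NS and t6_with_oib are assumptions for this example.

```python
# (minimum, maximum) nanoseconds added to the processor's own t6
# output-valid-delay limits when the OIB is installed, as read from Table 2.
T6_OIB_OFFSET_NS = {
    "A/D 0:31": (8.0, 16.0),
    "ALE#": (10.0, 20.0),
}


def t6_with_oib(signal_group: str, processor_min_ns: float, processor_max_ns: float):
    """Return the (min, max) t6 limits seen at the emulator with the OIB installed."""
    add_min, add_max = T6_OIB_OFFSET_NS[signal_group]
    return processor_min_ns + add_min, processor_max_ns + add_max


if __name__ == "__main__":
    # Hypothetical processor limits of 2 ns and 20 ns, for illustration only.
    print(t6_with_oib("A/D 0:31", 2.0, 20.0))
```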


Symbol     Parameter                          Maximum

PM-Icc     Supply current with 80960KB-20     1400 mA
OIB-Icc    Supply current                     PM-Icc + 1100 mA


                 (without OIB installed)      (with OIB installed)
                 Iih          Iil             Iih          Iil
Signal           Maximum      Maximum         Maximum      Maximum

AD(0:31)         100 µA       0.6 mA          20 µA        -1 mA
DEN#             40 µA        1.0 mA          20 µA        -1 mA
W/R#             140 µA       1.6 mA          20 µA        -1 mA
ADS#             140 µA       1.6 mA          20 µA        -1 mA
CLK2             80 µA        2.2 mA          50 µA        -2 mA
RESET                                         50 µA        -2 mA
BE(0:3)#                                      20 µA        -1 mA
DT/R#                                         20 µA        -1 mA
INT0#, INT3#                                  20 µA        -1 mA
INT1, INT2                                    20 µA        -1 mA
BADAC#                                        20 µA        -1 mA
ALE#                                          20 µA        -1 mA
LOCK#                                         20 µA        -1 mA
READY#                                        20 µA        -1 mA
HOLD                                          20 µA        -1 mA
FAILURE#                                      20 µA        -1 mA

Power Supply 


100-120V or 220-240V (Selectable)
50-60 Hz
2 amps (AC Max) @ 120V
1 amp (AC Max) @ 240V


Environmental Characteristics


Operating Temperature    10°C to 40°C (50°F to 104°F)
Operating Humidity       Maximum 85% Relative Humidity, non-condensing
Order Code     Description

ICE960KB       The complete ICE-960KB emulator system including control unit,
               processor module, power supply, SAST, OIB, SAB, serial
               communications cable (SCOM4), IEDIT, software version 1.x,
               and upgrade certificate for version 2.0 software.
               (Requires software license, Class I)

               The complete ICE-960KB emulator system including control unit,
               processor module, power supply, SAST, OIB, serial
               communications cable (SCOM4), IEDIT, software version 1.x,
               upgrade certificate for version 2.0 software, and 2MB
               Above™ Board. (Requires software license, Class I)

               The complete ICE-960KB emulator system including control unit,
               processor module, power supply, SAST, OIB, serial
               communications cable (SCOM4), IEDIT, software version 1.x
               (version 2.0 software is not included).
               (Requires software license, Class I)

               DOS hosted assembler, linker/loader, macro preprocessor,
               archiver (librarian), PROM builder, and other object module
               utilities. (Requires software license, Class I, plus addendum)

               DOS hosted optimizing C compiler, with ANSI extensions for
               embedded applications. Contains standard STDIO libraries and
               has inline assembly capability. Requires a 2M byte
               Above™ Board. (Requires software license, Class I)


For direct information on Intel's Development Tools, or
for the number of your nearest sales office or distributor,
call 800-874-6835 (U.S.). For information or literature on
additional Intel products, call 800-548-4725 (U.S. and
Canada).


UNITED STATES, Intel Corporation
3065 Bowers Ave., Santa Clara, CA 95051
Tel: (408) 765-8080


JAPAN, Intel Japan K.K.
5-6 Tokodai, Tsukuba-shi, Ibaraki, 300-26
Tel: 029747-8511


UNITED KINGDOM, Intel Corporation (U.K.) Ltd.
Pipers Way, Swindon, Wiltshire, England SN3 1RJ
Tel: (0793) 696000




ALABAMA 


~n,':~OrctDr .. ll'2 
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T-'; (205)830-4010 


ARIZONA 


n~~5~8thDr. 
SulteD·21. 
Phoenix 
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Tel: (802) 869-4980 
l~~~ 
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Place 


•••• 
30' 
Tucson
85715 
Tel: (602) 299-6815 


CALIFORNIA
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Street 


Suite 
116 
f:~:8~;~9~~ 
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SUite 218 
~••~f~ 


†Intel Corp.
~'=O~;~IW 
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Drive 
Suite 
105 


~~;(~~4~~~ 


~~~~Avenue 
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~~~~e:s.~~ 
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2700 San Tom •• Expr •• twIY 
2nd Floor
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408-727-2t20 


COLORADO
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'00 
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~~~~3;';':1 


CONNECTICUT


~='IlI~Ra.d 
2•• .- 
~~~'1~ 
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200 
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~~.C:fr1;~le 
Road, Suite 400
ri1~fE 


INDIANA 
l!Fi;'~Rood 
Suite 125
~~~~J~t= 


IOWA 


Intel Corp.
1930 St. Andrews Drive N.E.
2nd Floor
Cedar Rapids 52402
Tel: (319) 393-5510 
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tlntelCorp 


~~~J'210 
Tel: (913) 345-2727 
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; (301) 441·1020 
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~~~ 
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~~~St.,Su~3IO 
Bloomington
55431 
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